EL834341252US 



REPEAT SEQUENCES OF THE CA125 GENE AND THEIR USE 
FOR DIAGNOSTIC AND THERAPEUTIC INTERVENTIONS 



CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit of U.S. Provisional Application Serial No. 60/284,175,
filed April 17, 2001, and U.S. Provisional Application Serial No. 60/299,380, filed June 19, 2001,
which are incorporated by reference in their entirety.

BACKGROUND OF THE INVENTION 

The present invention relates generally to the cloning, identification, and expression of
multiple repeat sequences of the CA125 gene in vitro and, more specifically, to the use of
recombinant CA125 with epitope binding sites for diagnostic and therapeutic purposes.

CA125 is an antigenic determinant located on the surface of ovarian carcinoma cells with
essentially no expression in normal adult ovarian tissue. Elevated in the sera of patients with
ovarian adenocarcinoma, CA125 has played a critical role for more than 15 years in the
management of these patients relative to their response to therapy and also as an indicator of
recurrent disease.

It is well established that CA125 is not uniquely expressed in ovarian carcinoma, but is
also found in both normal secretory tissues and other carcinomas (i.e., pancreas, liver, colon)
[Hardardottir H et al, Distribution of CA125 in embryonic tissue and adult derivatives of the
fetal periderm, Am J Obstet Gynecol 163;6(1):1925-1931 (1990); Zurawski VR et al, Tissue
distribution and characteristics of the CA125 antigen, Cancer Rev. 11-12:102-108 (1988);
O'Brien TJ et al, CA125 antigen in human amniotic fluid and fetal membranes, Am J Obstet
Gynecol 155:50-55 (1986); and Nap M et al, Immunohistochemical characterization of 22
monoclonal antibodies against the CA125 antigen: 2nd report from the ISOBM TD-1 workshop,
Tumor Biology 17:325-332 (1996)]. Notwithstanding, CA125 correlates directly with the
disease status of affected patients (i.e., progression, regression, and no change), and has become
the "gold standard" for monitoring patients with ovarian carcinoma [Bast RC et al, A
radioimmunoassay using a monoclonal antibody to monitor the course of epithelial ovarian
cancer, N Engl J Med 309:883-887 (1983); and Bon GC et al, Serum tumor marker
immunoassays in gynecologic oncology: Establishment of reference values, Am J Obstet
Gynecol 174:107-114 (1996)]. CA125 is especially useful in post-menopausal patients where
endometrial tissue has become atrophic and, as a result, is not a major source of normal
circulating CA125.

During the mid 1980's, the inventor of the present invention and others developed M11, a
monoclonal antibody to CA125. M11 binds to a dominant epitope on the repeat structure of the
CA125 molecule [O'Brien TJ et al, New monoclonal antibodies identify the glycoprotein
carrying the CA125 epitope, Am J Obstet Gynecol 165:1857-64 (1991)]. More recently, the
inventor and others developed a purification and stabilization scheme for CA125, which allows
for the accumulation of highly purified high molecular weight CA125 [O'Brien TJ et al, More
than 15 years of CA125: What is known about the antigen, its structure and its function, Int J
Biological Markers 13(4):188-195 (1998)].

Considerable progress has been made over the years to further characterize the CA125
molecule, its structure and its function. The CA125 molecule is a high molecular weight
glycoprotein with a predominance of O-linked sugar side chains. The native molecule exists as a
very large complex (~2-5 million daltons). The complex appears to be composed of an epitope
containing CA125 molecule and binding proteins which carry no CA125 epitopes. The CA125
molecule is heterogeneous in both size and charge, most likely due to continuous deglycosylation
of the side chains during its life-span in bodily fluids. The core CA125 subunit is in excess of
200,000 daltons, and retains the capacity to bind both OC125 and M11 class antibodies. While
the glycoprotein has been described biochemically and metabolically by the inventor of the
present invention and others, no one has yet cloned the CA125 gene, which would provide the
basis for understanding its structure and its physiologic role in both normal and malignant tissues.

Despite the advances in detection and quantitation of serum tumor markers like CA125,
the majority of ovarian cancer patients are still diagnosed at an advanced stage of the disease
(Stage III or IV). Further, the management of patients' responses to treatment and the detection of
disease recurrence remain major problems. There thus remains a need to significantly improve
and standardize current CA125 assay systems. Further, the development of an early indicator of
risk of ovarian cancer will provide a useful tool for early diagnosis and improved prognosis.

SUMMARY OF THE INVENTION

The CA125 gene has been cloned and multiple repeat sequences as well as the carboxy
terminus have been identified. CA125 requires a transcript of more than 35,000 bases and
occupies approximately 150,000 bp on chromosome 19q13.2. The CA125 molecule comprises
three major domains: an extracellular amino terminal domain (Domain 1); a large multiple repeat
domain (Domain 2); and a carboxy terminal domain (Domain 3) which includes a transmembrane
anchor with a short cytoplasmic domain. The amino terminal domain is assembled by combining
five genomic exons, four very short amino terminal sequences and one extraordinarily large
exon. This domain is dominated by its capacity for O-glycosylation and its resultant richness in
serine and threonine residues.

The extracellular repeat domain, which characterizes the CA125 molecule, also represents
a major portion of the CA125 molecular structure. It is downstream from the amino terminal
domain and presents itself in a much different manner to its extracellular matrix neighbors.
These repeats are characterized by many features including a highly-conserved nature and a
uniformity in exon structure. But most consistently, a cysteine enclosed sequence may form a
cysteine loop. Domain 2 comprises 156 amino acid repeat units of the CA125 molecule. The
repeat domain constitutes the largest proportion of the CA125 molecule. The repeat units also
include the epitopes now well-described and classified for both the major class of CA125
antibodies of the OC125 group and the M11 group. More than 60 repeat units have been
identified, sequenced, and contiguously placed in the CA125 domain structure. The repeat
sequences demonstrated 70-85% homology to each other. The existence of the repeat sequences
was confirmed by expression of the recombinant protein in E. coli where both OC125/M11 class
antibodies were found to bind to sites on the CA125 repeat.

The CA125 molecule is anchored at its carboxy terminal through a transmembrane
domain and a short cytoplasmic tail. The carboxy terminal also contains a proteolytic cleavage
site approximately 50 amino acids upstream from the transmembrane domain, which allows for
proteolytic cleavage and release of the CA125 molecule.

The identification and sequencing of multiple repeat domains of the CA125 antigen
provides potentially new clinical and therapeutic applications for detecting, monitoring and
treating patients with ovarian cancer and other carcinomas where CA125 is expressed. For
example, the ability to express repeat domains of CA125 with the appropriate epitopes would
provide a much needed standard reagent for research and clinical applications. Current assays for
CA125 utilize as standards either CA125 produced from cultured cell lines or from patient ascites
fluid. Neither source is defined with regard to the quality or purity of the CA125 molecule. The
present invention overcomes the disadvantages of current assays by providing multiple repeat
domains of CA125 with epitope binding sites. At least one or more of any of the more than 60
repeats shown in Table 16 can be used as a "gold standard" for testing the presence of CA125.
Furthermore, new and more specific assays may be developed utilizing recombinant products for
antibody production.

Perhaps even more significantly, the multiple repeat domains of CA125 or other domains
could also be used for the development of a potential vaccine for patients with ovarian cancer. In
order to induce cellular and humoral immunity in humans to CA125, murine antibodies specific
for CA125 were utilized in anticipation of patient production of anti-idiotypic antibodies, thus
indirectly allowing the induction of an immune response to the CA125 molecule. With the
availability of recombinant CA125, especially domains which encompass epitope binding sites
for known murine antibodies, it will be feasible to more directly stimulate patients' immune
systems to CA125 and, as a result, extend the life of ovarian carcinoma patients.

The recombinant CA125 of the present invention may also be used to develop therapeutic 
targets. Molecules like CA125, which are expressed on the surface of tumor cells, provide 
potential targets for immune stimulation, drug delivery, biological modifier delivery or any agent
which can be specifically delivered to ultimately kill the tumor cells. Humanized or human 
antibodies to CA125 epitopes could be used to deliver all drug or toxic agents including 
radioactive agents to mediate direct killing of tumor cells. Natural ligands having a natural 
binding affinity for domains on the CA125 molecule could also be utilized to deliver therapeutic 
agents to tumor cells. 

CA125 expression may further provide a survival or metastatic advantage to ovarian 
tumor cells. Antisense oligonucleotides derived from the CA125 repeat sequences could be used 
to down-regulate the expression of CA125. Further, antisense therapy could be used in 
association with a tumor cell delivery system of the type described above. 
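
As a minimal illustration of how an antisense sequence relates to a sense-strand target, the sketch below computes the reverse complement of a DNA string. The function name `antisense_dna` and the six-base input are hypothetical and chosen only for illustration; they are not taken from the CA125 sequences of the present disclosure, and practical antisense design involves further considerations (length, chemistry, target accessibility) that are not modeled here.

```python
# Hypothetical helper: reverse complement of a DNA sense strand.
# The example sequence below is a toy input, not a CA125 sequence.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense_5_to_3):
    """Return the reverse complement (written 5'->3') of a DNA sense strand."""
    unexpected = set(sense_5_to_3.upper()) - set("ACGT")
    if unexpected:
        raise ValueError("unexpected characters: %s" % sorted(unexpected))
    # Complement each base, then reverse so the result reads 5'->3'.
    return sense_5_to_3.translate(_COMPLEMENT)[::-1]


toy_target = "ATGGCC"
toy_antisense = antisense_dna(toy_target)
```

Applying the function twice returns the original string, which is a convenient sanity check on any candidate sequence.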



Recombinant domains of the CA125 molecule also have the potential to identify small 
molecules, which bind to individual domains of the CA125 molecule. These small molecules 
could also be used as delivery agents or as biological modifiers.

In one aspect of the present invention, a CA125 molecule is disclosed comprising: (a) an
extracellular amino terminal domain, comprising 5 genomic exons, wherein exon 1 comprises amino
acids #1-33 of SEQ ID NO: 299, exon 2 comprises amino acids #34-1593 of SEQ ID NO: 299, exon
3 comprises amino acids #1594-1605 of SEQ ID NO: 299, exon 4 comprises amino acids #1606-1617
of SEQ ID NO: 299, and exon 5 comprises amino acids #1618-1637 of SEQ ID NO: 299; (b) a
multiple repeat domain, wherein each repeat unit comprises 5 genomic exons, wherein exon 1
comprises amino acids #1-42 in any of SEQ ID NOS: 164 through 194; exon 2 comprises amino
acids #43-65 in any of SEQ ID NOS: 195 through 221; exon 3 comprises amino acids #66-123 in any
of SEQ ID NOS: 222 through 249; exon 4 comprises amino acids #124-135 in any of SEQ ID NOS:
250 through 277; and exon 5 comprises amino acids #136-156 in any of SEQ ID NOS: 278 through
298; and (c) a carboxy terminal domain comprising a transmembrane anchor with a short cytoplasmic
domain, and further comprising 9 genomic exons, wherein exon 1 comprises amino acids #1-11 of
SEQ ID NO: 300; exon 2 comprises amino acids #12-33 of SEQ ID NO: 300; exon 3 comprises
amino acids #34-82 of SEQ ID NO: 300; exon 4 comprises amino acids #83-133 of SEQ ID NO:
300; exon 5 comprises amino acids #134-156 of SEQ ID NO: 300; exon 6 comprises amino acids
#157-212 of SEQ ID NO: 300; exon 7 comprises amino acids #213-225 of SEQ ID NO: 300; exon 8
comprises amino acids #226-253 of SEQ ID NO: 300; and exon 9 comprises amino acids #254-284
of SEQ ID NO: 300.
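
The exon boundaries recited above can be written down as simple coordinate tables, which makes it easy to check that the exons of each domain tile the domain without gaps or overlaps and to look up which exon contains a given residue. The following is only an illustrative sketch: the coordinates are copied from the preceding paragraph, while the variable and function names are hypothetical conveniences and form no part of the disclosure.

```python
# Exon boundaries (1-based, inclusive amino acid positions) as recited above.
# All names below are illustrative only.
AMINO_TERMINAL_EXONS = [  # SEQ ID NO: 299
    (1, 33), (34, 1593), (1594, 1605), (1606, 1617), (1618, 1637),
]
REPEAT_UNIT_EXONS = [  # one 156 amino acid repeat unit
    (1, 42), (43, 65), (66, 123), (124, 135), (136, 156),
]
CARBOXY_TERMINAL_EXONS = [  # SEQ ID NO: 300
    (1, 11), (12, 33), (34, 82), (83, 133), (134, 156),
    (157, 212), (213, 225), (226, 253), (254, 284),
]


def tiles_contiguously(exons):
    """True if exons start at 1 and each begins right after the previous ends."""
    expected_start = 1
    for start, end in exons:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return True


def exon_lengths(exons):
    """Length in amino acids of each exon."""
    return [end - start + 1 for start, end in exons]


def exon_for_position(exons, position):
    """Return the 1-based exon number containing a residue, or None."""
    for number, (start, end) in enumerate(exons, start=1):
        if start <= position <= end:
            return number
    return None
```

For example, under these coordinates residues #59 and #79 of a repeat unit fall in exons 2 and 3 respectively, and residue #230 of the carboxy terminal domain falls in exon 8.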

In another aspect of the present invention, the N-glycosylation sites of the amino terminal
domain marked (x) in Figure 8B are encoded at positions #81, #271, #320, #624, #795, #834, #938,
and #1,165 in SEQ ID NO: 299.

In another aspect of the present invention, the serine and threonine O-glycosylation pattern for
the amino terminal domain is marked (o) in SEQ ID NO: 299 in Figure 8B.

In another aspect of the present invention, exon 1 in the repeat domain comprises at least 31
different copies; exon 2 comprises at least 27 different copies; exon 3 comprises at least 28 different
copies; exon 4 comprises at least 28 different copies; and exon 5 comprises at least 21 different
copies.

In another aspect of the present invention, the repeat domain comprises 156 amino acid repeat
units which comprise epitope binding sites. The epitope binding sites are located in the C-enclosure
at amino acids #59-79 (marked C-C) in SEQ ID NO: 150 in Figure 5.

In another aspect, the 156 amino acid repeat unit comprises O-glycosylation sites at positions
#128, #129, #132, #133, #134, #135, #139, #145, #146, #148, #150, #151, and #156 in SEQ ID NO:
150 in Figure 5C. The 156 amino acid repeat unit further comprises N-glycosylation sites at
positions #33 and #49 in SEQ ID NO: 150 in Figure 5C. The repeat unit also includes at least one
conserved methionine (designated M) at position #24 in SEQ ID NO: 150 in Figure 5C.

In yet another aspect, the transmembrane domain of the carboxy terminal domain is located at
positions #230-252 (underlined) in SEQ ID NO: 300 of Figure 9B. The cytoplasmic domain of the
carboxy terminal domain comprises a highly basic sequence adjacent to the transmembrane domain at
positions #256-260 in SEQ ID NO: 300 of Figure 9B, serine and threonine phosphorylation sites at
positions #254, #255, and #276 in SEQ ID NO: 300 in Figure 9B, and tyrosine phosphorylation sites
at positions #264, #273, and #274 in SEQ ID NO: 300 of Figure 9B.

In another aspect of the present invention, an isolated nucleic acid of the CA125 gene is
disclosed, which comprises a nucleotide sequence selected from the group consisting of: (a) the
nucleotide sequences set forth in SEQ ID NOS: 49, 67, 81, 83-145, 147, 150, and 152; (b) a
nucleotide sequence having at least 70% sequence identity to any one of the sequences in (a); (c) a
degenerate variant of any one of (a) to (b); and (d) a fragment of any one of (a) to (c).

In another aspect of the present invention, an isolated nucleic acid of the CA125 gene is disclosed,
comprising a sequence that encodes a polypeptide with the amino acid sequence selected from the
group consisting of: (a) the amino acid sequences set forth in SEQ ID NOS: 11-47, 50-80, 82, 146,
148, 149, 151, and 153-158; (b) an amino acid sequence having at least 50% sequence identity to any
one of the sequences in (a); (c) a conservative variant of any one of (a) to (b); and (d) a fragment of
any one of (a) to (c).
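
The "at least 70%" and "at least 50%" sequence identity thresholds recited above can be illustrated with a simple calculation. The sketch below is an assumption-laden simplification: it presumes the two sequences have already been aligned to equal length without gaps and counts identical positions, whereas this document does not specify an alignment method or identity formula. The function names and the toy sequences are hypothetical.

```python
# Illustrative only: percent identity over two pre-aligned, equal-length,
# gap-free sequences. The alignment method itself is an assumption and is
# not specified in this document.
NUCLEOTIDE_THRESHOLD = 70.0   # "at least 70% sequence identity"
AMINO_ACID_THRESHOLD = 50.0   # "at least 50% sequence identity"


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a, seq_b, threshold):
    """True if the percent identity is at or above the given threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


# Toy sequences (hypothetical, not taken from the sequence listing).
toy_nt_identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
toy_aa_identity = percent_identity("MKTL", "MRSL")
```

In practice a pairwise alignment tool would be used to produce the aligned sequences first; the counting step shown here is only the final part of such a comparison.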

In yet another aspect, a vector comprising the nucleic acid of the CA125 gene is disclosed.
The vector may be a cloning vector, a shuttle vector, or an expression vector. A cultured cell 
comprising the vector is also disclosed. 

In yet another aspect, a method of expressing CA125 antigen in a cell is disclosed, comprising
the steps of: (a) providing at least one nucleic acid comprising a nucleotide sequence selected from
the group consisting of: (i) the nucleotide sequences set forth in SEQ ID NOS: 49, 67, 81, 83-145,
147, 150, and 152; (ii) a nucleotide sequence having at least 70% sequence identity to any one of the
sequences in (i); (iii) a degenerate variant of any one of (i) to (ii); and (iv) a fragment of any one of
(i) to (iii); (b) providing cells comprising an mRNA encoding the CA125 antigen; and (c) introducing
the nucleic acid into the cells, wherein the CA125 antigen is expressed in the cells.

In yet another aspect, a purified polypeptide of the CA125 gene is disclosed, comprising an amino
acid sequence selected from the group consisting of: (a) the amino acid sequences set forth in SEQ ID
NOS: 11-48, 50, 68-80, 82, 146, 148, 149, 150, 151, and 153-158; (b) an amino acid sequence having
at least 50% sequence identity to any one of the sequences in (a); (c) a conservative variant of any
one of (a) to (b); and (d) a fragment of any one of (a) to (c).

In another aspect, a purified antibody is disclosed that selectively binds to an epitope in the
receptor-binding domain of CA125 protein, wherein the epitope is within the amino acid sequence
selected from the group consisting of: (a) the amino acid sequences set forth in SEQ ID NOS: 11-48,
50, 68-80, 146, 151, and 153-158; (b) an amino acid sequence having at least 50% sequence identity
to any one of the sequences in (a); (c) a conservative variant of any one of (a) to (b); and (d) a
fragment of any one of (a) to (c).

A diagnostic for detecting and monitoring the presence of CA125 antigen is also disclosed,
which comprises recombinant CA125 comprising at least one repeat unit of the CA125 repeat domain
including epitope binding sites selected from the group consisting of amino acid sequences set forth
in SEQ ID NOS: 11-48, 50, 68-80, 82, 146, 150, 151, 153-161, and 162 (amino acids #1,643-11,438).

A therapeutic vaccine to treat mammals with elevated CA125 antigen levels or at risk of
developing a disease or disease recurrence associated with elevated CA125 antigen levels is also
disclosed. The vaccine comprises recombinant CA125 repeat domains including epitope binding
sites, wherein the repeat domains are selected from the group of amino acid sequences consisting of
SEQ ID NOS: 11-48, 50, 68-80, 82, 146, 148, 149, 150, 151, 153-161, and 162 (amino acids
#1,643-11,438), and amino acids #175-284 of SEQ ID NO: 300. Mammals include animals and humans.

In another aspect of the present invention, an antisense oligonucleotide is disclosed that
inhibits the expression of CA125 encoded by: (a) the nucleotide sequences set forth in SEQ ID
NOS: 49, 67, 81, 83-145, 147, 150, and 152; (b) a nucleotide sequence having at least 70% sequence
identity to any one of the sequences in (a); (c) a degenerate variant of any one of (a) to (b); and (d) a
fragment of any one of (a) to (c).

The preceding and further aspects of the present invention will be apparent to those of
ordinary skill in the art from the following description of the presently preferred embodiments of
the invention, such description being merely illustrative of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 illustrates the cyanogen bromide digested products of CA125 on Western blot
probed with M11 and OC125 antibodies. Table 1 shows the amino acid sequence derived from the
amino terminal end of the 40 kDa cyanogen bromide peptide along with internal sequences obtained
after protease digestion of the 40 kDa fragment (SEQ ID NOS: 1-4). SEQ ID NO: 1 is the amino
terminal sequence derived from the 40 kDa peptide and SEQ ID NOS: 2, 3, and 4 reflect internal amino
acid sequences derived from peptides after protease digestion of the 40 kDa fragment. Table 1
further provides a translation of the EST (BE005912) with homologous sequences (SEQ ID NOS: 5
and 6) either boxed or underlined. Protease cleavage sites are indicated by arrows.

Figure 2A illustrates PCR amplification of products generated from primers utilizing the EST
sequence referred to in Figure 1, the amino acid sequence obtained from the 40 kDa fragment and
EST sequence AA# 640762. Lane 1-2: normal; 3: serous ovarian carcinoma; 4: serous ovarian
carcinoma; 5: mucinous ovarian carcinoma; 6: β-tubulin control. The band of the anticipated size
(400 bp) is present in lane 3 and less abundantly in lane 4.

Figure 2B illustrates the RT-PCR that was performed to determine the presence or absence of
CA125 transcripts in primary culture cells of ovarian tumors. This expression was compared to
tubulin expression as an internal control. Lanes 1, 3, 5, 7, and 9 represent the primary ovarian tumor
cell lines. Lanes 2, 4, 6, and 8 represent peripheral blood mononuclear cell lines derived from the
corresponding patients in lanes 1, 3, 5, and 7. Lane 10 represents fibroblasts from the patient tumor
in lane 9. Lanes 11 and 12 are CaOV3 and a primary tumor specimen, respectively.

Figure 3 illustrates repeat sequences determined by sequencing cloned cDNA from the 400 bp
band in Figure 2B. Placing of repeat sequences in a contiguous fashion was accomplished by PCR
amplification and sequencing of overlap areas between two repeat sequences. A sample of the
complete repeat sequences is shown in SEQ ID NOS: 158, 159, 160, and 161, which were obtained in
this manner and placed next to each other based on overlap sequences. The complete list of repeat
sequences that was obtained is shown in Table 21 (SEQ ID NO: 162).

Figure 4 illustrates three Western immunoblot patterns: Panel A = probed with M11, Panel B
= probed with OC125 and Panel C = probed with antibody ISOBM 9.2. Each panel represents E. coli
extracts as follows: lane 1 = E. coli extract from bacteria with the plasmid PQE-30 only. Lane 2 = E.
coli extract from bacteria with the plasmid PQE-30 which includes the CA125 repeat unit. Lane 3 =
E. coli extract from bacteria with the plasmid PQE-30 which includes the TADG-14 protease
unrelated to CA125. Panel D shows a Coomassie blue stain of a PAGE gel of E. coli extract derived
from either PQE-30 alone or from bacteria infected with PQE-30 - CA125 repeat (recombinant
CA125 repeat).

Figure 5 represents Western blots of the CA125 repeat sequence that were generated to
determine the position of the M11 epitope within the recombinant CA125 repeat. The expressed
protein was bound to Ni-NTA agarose beads. The protein was left undigested or digested with
Asp-N or Lys-C. The protein remaining bound to the beads was loaded into lanes 1, 2, or 3
corresponding to undigested, Asp-N digested and Lys-C digested, respectively. The supernatants
from the digestions were loaded in lanes 4, 5, and 6 corresponding to undigested, Asp-N digested and
Lys-C digested, respectively. The blots were probed with either anti-His tag antibody (A) or M11
antibody (B). Panel C shows a typical repeat sequence corresponding to SEQ ID NO: 150 with
each exon defined by arrows. All proteolytic aspartic acid and lysine sites are marked with
overhead arrows or dashes. In the lower panel, the O-glycosylation sites in exons 4 and 5 are
marked with O, the N-glycosylation sites are marked with X plus the amino acid number in the
repeat (#12, 33, and 49), the conserved methionine is designated with M plus the amino acid
number (M#24), and the cysteine enclosure, which is also present in all repeats and encompasses
19 amino acids between the cysteines, is marked with C-C (amino acids #59-79). The epitopes
for M11 and OC125 are located in the latter part of the C-enclosure or downstream from the
C-enclosure.
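
The fragments expected from the Asp-N and Lys-C digestions described for Figure 5 can be predicted in silico. The sketch below assumes the textbook specificities (Asp-N cleaving on the amino side of aspartic acid, Lys-C cleaving on the carboxy side of lysine) and complete digestion; missed cleavages and secondary specificities are ignored. The function names and the eight-residue toy peptide are hypothetical and are not taken from SEQ ID NO: 150.

```python
# Illustrative in-silico digest under idealized, complete cleavage.
# Assumed specificities: Asp-N cuts before D; Lys-C cuts after K.
def digest(sequence, residue, cut_before):
    """Split a one-letter amino acid sequence at every occurrence of residue."""
    fragments = []
    start = 0
    for index, amino_acid in enumerate(sequence):
        if amino_acid != residue:
            continue
        cut = index if cut_before else index + 1
        # Skip cuts that would produce an empty fragment at either end.
        if start < cut < len(sequence):
            fragments.append(sequence[start:cut])
            start = cut
    fragments.append(sequence[start:])
    return fragments


def asp_n(sequence):
    """Fragments after cleavage on the amino side of each aspartic acid (D)."""
    return digest(sequence, "D", cut_before=True)


def lys_c(sequence):
    """Fragments after cleavage on the carboxy side of each lysine (K)."""
    return digest(sequence, "K", cut_before=False)


toy_peptide = "MADKLDGK"  # hypothetical peptide, for illustration only
asp_n_fragments = asp_n(toy_peptide)
lys_c_fragments = lys_c(toy_peptide)
```

Comparing which predicted fragments retain a tag or an epitope is one way to reason about blots of the kind shown in panels A and B; the actual mapping in this document rests on the experimental data, not on such a prediction.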

Figure 6 illustrates a Northern blot analysis of RNA derived from either normal ovary (N) or
ovarian carcinoma (T) probed with a 32P-labelled cDNA repeat sequence of CA125. Total RNA samples
(10 µg) were size separated by electrophoresis on a formaldehyde 1.2% agarose gel. After blotting to
Hybond N, the lanes were probed with the 32P-radiolabelled 400 bp repeat (see Figure 2). Lane 1
represents RNA from normal ovarian tissue, and lane 2 represents RNA from serous ovarian tumor
tissue.

Figure 7A is a schematic diagram of a typical repeat unit for CA125 showing the
N-glycosylation sites at the amino end and the totally conserved methionine (M). Also shown is the
proposed cysteine enclosed loop with antibody binding sites for OC125 and M11. Also noted are the
highly O-glycosylated residues at the carboxy end of the repeat.

Figure 7B represents the genomic structure and exon configuration of a 156 amino acid repeat 
sequence of CA125 (SEQ ID NO: 163), which comprises a standard repeat unit. 

Figure 7C lists the individual known sequences for each exon, which have been determined as 
follows: Exon 1 - SEQ ID NOS: 164-194; Exon 2 - SEQ ID NOS: 195-221; Exon 3 - SEQ ID NOS: 
222-249; Exon 4 - SEQ ID NOS: 250-277; and Exon 5 - SEQ ID NOS: 278-298. 

Figure 8A shows the genomic structure of the amino terminal end of the CA125 gene. It also
indicates the amino acid composition of each exon in the extracellular domain.

Figure 8B illustrates the amino acid composition of the amino terminal domain (SEQ ID NO:
299) with each potential O-glycosylation site marked with a superscript (o) and N-glycosylation sites
marked with a superscript (x). T-TALK sequences are underlined.

Figure 9A illustrates the genomic exon structure of the carboxy-terminal domain of the
CA125 gene. It includes a diagram showing the extracellular portion, the potential cleavage site, the
transmembrane domain and the cytoplasmic tail.

Figure 9B illustrates the amino acid composition of the carboxy terminal domain (SEQ ID
NO: 300) including the exon boundaries, O-glycosylation sites (o), and N-glycosylation sites (x).
The proposed transmembrane domain is underlined.

Figure 10 illustrates the proposed structure of the CA125 molecule based on the open reading 
frame sequence described herein. As shown, the molecule is dominated by a major repeat domain in 
the extracellular space along with a highly glycosylated amino terminal repeat. The molecule is 
anchored by a transmembrane domain and also includes a cytoplasmic tail with potential for 
phosphorylation. 

DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the present invention, conventional molecular biology, microbiology,
and recombinant DNA techniques may be used that will be apparent to those skilled in the
relevant art. Such techniques are explained fully in the literature (see, e.g., Maniatis, Fritsch &
Sambrook, "Molecular Cloning: A Laboratory Manual" (1982); "DNA Cloning: A Practical
Approach," Volumes I and II (D. N. Glover ed. 1985); "Oligonucleotide Synthesis" (M. J. Gait
ed. 1984); "Nucleic Acid Hybridization" (B. D. Hames & S. J. Higgins eds. (1985));
"Transcription and Translation" (B. D. Hames & S. J. Higgins eds. (1984)); "Animal Cell
Culture" (R. I. Freshney, ed. (1986)); "Immobilized Cells And Enzymes" (IRL Press, (1986));
and B. Perbal, "A Practical Guide To Molecular Cloning" (1984)).

Therefore, if appearing herein, the following terms shall have the definitions set out
below.

A "vector" is a replicon, such as plasmid, phage or cosmid, to which another DNA
segment may be attached so as to bring about the replication of the attached segment.

A "DNA molecule" refers to the polymeric form of deoxyribonucleotides (adenine,
guanine, thymine, or cytosine) in either single stranded form, or a double-stranded helix. This
term refers only to the primary and secondary structure of the molecule, and does not limit it to
any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in
linear DNA molecules (e.g., restriction fragments), viruses, plasmids, and chromosomes.

As used herein, the term "gene" shall mean a region of DNA encoding a polypeptide
chain.

"Messenger RNA" or "mRNA" shall mean an RNA molecule that encodes for one or
more polypeptides.

"DNA polymerase" shall mean an enzyme which catalyzes the polymerization of
deoxyribonucleotide triphosphates to make DNA chains using a DNA template.

"Reverse transcriptase" shall mean an enzyme which catalyzes the polymerization of
deoxy- or ribonucleotide triphosphates to make DNA or RNA chains using an RNA or DNA
template.

"Complementary DNA" or "cDNA" shall mean the DNA molecule synthesized by
polymerization of deoxyribonucleotides by an enzyme with reverse transcriptase activity.

An "isolated nucleic acid" is a nucleic acid the structure of which is not identical to that of 
any naturally occurring nucleic acid or to that of any fragment of a naturally occurring genomic 
nucleic acid spanning more than three separate genes. The term therefore covers, for example, 
(a) a DNA which has the sequence of part of a naturally occurring genomic DNA molecule but is 
not flanked by both of the coding sequences that flank that part of the molecule in the genome of
the organism in which it naturally occurs; (b) a nucleic acid incorporated into a vector or into the
genomic DNA of a prokaryote or eukaryote in a manner such that the resulting molecule is not
identical to any naturally occurring vector or genomic DNA; (c) a separate molecule such as a
cDNA, a genomic fragment, a fragment produced by polymerase chain reaction (PCR), or a
restriction fragment; and (d) a recombinant nucleotide sequence that is part of a hybrid gene, i.e.,
a gene encoding a fusion protein.

"Oligonucleotide", as used herein in referring to the probes or primers of the present 
invention, is defined as a molecule comprised of two or more deoxy- or ribonucleotides, 
preferably more than ten. Its exact size will depend upon many factors which, in turn, depend 
upon the ultimate function and use of the oligonucleotide. 

"DNA fragment" includes polynucleotides and/or oligonucleotides and refers to a plurality 
of joined nucleotide units formed from naturally-occurring bases and cyclofuranosyl groups 
joined by native phosphodiester bonds. This term effectively refers to naturally-occurring
species or synthetic species formed from naturally-occurring subunits. "DNA fragment" also
refers to purine and pyrimidine groups and moieties which function similarly but which have
non-naturally-occurring portions. Thus, DNA fragments may have altered sugar moieties or
inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur containing
species. They may also contain altered base units or other modifications, provided that biological 
activity is retained. DNA fragments may also include species which include at least some 
modified base forms. Thus, purines and pyrimidines other than those normally found in nature 
may be so employed. Similarly, modifications on the cyclofuranose portions of the nucleotide 
subunits may also occur as long as biological function is not eliminated by such modifications. 

"Primer" shall refer to an oligonucleotide, whether occurring naturally or produced 
synthetically, which is capable of acting as a point of initiation of synthesis when placed under 
conditions in which synthesis of a primer extension product, which is complementary to a nucleic 
acid strand, is induced, i.e., in the presence of nucleotides and an inducing agent such as a DNA 
polymerase and at a suitable temperature and pH. The primer may be either single-stranded or 
double-stranded and must be sufficiently long to prime the synthesis of the desired extension 
product in the presence of the inducing agent. The exact length of the primer will depend upon 
many factors, including temperature, the source of primer and the method used. For example, for
diagnostic applications, depending on the complexity of the target sequence, the oligonucleotide
primer typically contains 10-25 or more nucleotides, although it may contain fewer nucleotides.

The primers herein are selected to be "substantially" complementary to different strands of
a particular target DNA sequence. This means that the primers must be sufficiently
complementary to hybridize with their respective strands. Therefore, the primer sequence need
not reflect the exact sequence of the template. For example, a non-complementary nucleotide
fragment may be attached to the 5' end of the primer, with the remainder of the primer sequence
being complementary to the strand. Alternatively, non-complementary bases or longer sequences
can be interspersed into the primer, provided that the primer sequence has sufficient
complementarity with the sequence to hybridize therewith and thereby form the template for the
synthesis of the extension product.

As used herein, the term "hybridization" refers generally to a technique wherein denatured
RNA or DNA is combined with a complementary nucleic acid sequence which is either free in
solution or bound to a solid phase. As recognized by one skilled in the art, complete
complementarity between the two nucleic acid sequences is not a prerequisite for hybridization
to occur. The technique is ubiquitous in molecular genetics and its use centers around the
identification of particular DNA or RNA sequences within complex mixtures of nucleic acids.

As used herein, "restriction endonucleases" and "restriction enzymes" shall refer to
bacterial enzymes which cut double-stranded DNA at or near a specific nucleotide sequence.

"Purified polypeptide" refers to any peptide generated from CA125 either by proteolytic
cleavage or chemical cleavage.

"Degenerate variant" refers to any amino acid variation in the repeat sequence, which 
fulfills the homology exon structure and conserved sequences and is recognized by the M11,
OC125 and ISOBM series of antibodies.

"Fragment" refers to any part of the CA125 molecule identified in a purification scheme.

"Conservative variant antibody" shall mean any antibody that fulfills the criteria of M11,
OC125 or any of the ISOBM antibody series.

MATERIALS AND METHODS

A. Tissue collection, RNA Isolation and cDNA Synthesis 

Both normal and ovarian tumor tissues were utilized for cDNA preparation. Tissues were
routinely collected and stored at -80°C according to a tissue collection protocol.

Total RNA isolation was performed according to the manufacturer's instructions using the
TriZol Reagent purchased from GibcoBRL (Catalog #15596-018). In some instances, mRNA
was isolated using oligo dT affinity chromatography. The amount of RNA recovered was
quantitated by UV spectrophotometry. First strand complementary DNA (cDNA) was
synthesized using 5.0 µg of RNA and random hexamer primers according to the manufacturer's
protocol utilizing a first strand synthesis kit obtained from Clontech (Catalog #K1402-1). The
purity of the cDNA was evaluated by PCR using primers specific for the β-tubulin gene. These
primers span an intron such that the PCR products generated from pure cDNA can be
distinguished from those generated from cDNA contaminated with genomic DNA.

B. Identification and Ordering of CA125 Repeat Units

It has been demonstrated that the 2-5 million dalton CA125 glycoprotein (with repeat
domains) can be chemically segmented into glycopeptide fragments using cyanogen bromide. As
shown in Figure 1, several of these fragments, in particular the 40 kDa and 60 kDa fragments,
still bind to the two classical antibody groups defined by OC125 and M11.

To convert CA125 into a consistent glycopeptide, the CA125 parent molecule was
processed by cyanogen bromide digestion. This cleavage process resulted in two main fractions
on Coomassie blue staining following polyacrylamide gel electrophoresis. An approximately 60
kDa band and a more dominant 40 kDa band were identified as shown in Figure 1. When a
Western blot of these bands was probed with either OC125 or M11 antibodies (both of which
define the CA125 molecule), these bands bound both antibodies. The 40 kDa band was
significantly more prominent than the 60 kDa band. These data thus established the likelihood of
these bands (most especially the 40 kDa band) as being authentic cleavage peptides of the
CA125 molecule, which retained the identifying characteristic of OC125 and M11 binding.

The 40 kDa and 60 kDa bands were excised from PVDF blots and submitted to amino
terminal and internal peptide amino acid sequencing as described and practiced by Harvard
Sequencing (Harvard Microchemistry Facility and The Biological Laboratories, 16 Divinity
Avenue, Cambridge, Massachusetts 02138). Sequencing was successful only for the 40 kDa
band, where both amino terminal sequences and some internal sequences were obtained as shown
in Table 1 at SEQ ID NOS: 1-4. The 40 kDa fragment of the CA125 protein was found to have
homology to two translated EST sequences (GenBank Accession Nos. BE005912 and
AA640762). Visual examination of these translated sequences revealed similar amino acid
regions, indicating a possible repetitive domain. The nucleotide and amino acid sequences for
EST GenBank Accession No. BE005912 (corresponding to SEQ ID NO: 5 and SEQ ID NO: 6,
respectively) are illustrated in Table 1. Common sequences are boxed or underlined.

In an attempt to identify other individual members of this proposed repeat family, two
oligonucleotide primers were synthesized based upon regions of homology in these EST
sequences. Shown in Table 2A, the primer sequences correspond to SEQ ID NOS: 7 and 8
(sense primers) and SEQ ID NOS: 9 and 10 (antisense primers). Repeat sequences were
amplified in accordance with the methods disclosed in the following references: Shigemasa K et
al., p21: A monitor of p53 dysfunction in ovarian neoplasia, Int. J. Gynecol. Cancer 7:296-303
(1997) and Shigemasa K et al., p16 Overexpression: A potential early indicator of transformation
in ovarian carcinoma, J. Soc. Gynecol. Invest. 4:95-102 (1997). Ovarian tumor cDNA obtained
from a tumor cDNA bank was used.

Amplification was accomplished in a Thermal Cycler (Perkin-Elmer Cetus). The reaction
mixture consisted of 1 U Taq DNA Polymerase in storage buffer A (Promega), 1X Thermophilic
DNA Polymerase 10X Mg-free buffer (Promega), 300 mM dNTPs, 2.5 mM MgCl2, and 0.25 mM
each of the sense and antisense primers for the target gene. A 20 µl reaction included 1 µl of
cDNA synthesized from 50 ng of serous tumor mRNA as the template. PCR
reactions required an initial denaturation step at 94°C/1.5 min. followed by 35 cycles of 94°C/0.5
min., 48°C/0.5 min., 72°C/0.5 min. with a final extension at 72°C/7 min. Three bands were
initially identified (approximately 400 bp, 800 bp, and 1200 bp) and isolated. After size analysis by
agarose gel electrophoresis, these bands as well as any other products of interest were then ligated
into a T-vector plasmid (Promega) and transformed into competent DH5α strain of E. coli cells. After
growth on selective media, individual colonies were cultured overnight at 37°C, and plasmid
DNA was extracted using the QIAprep Spin Miniprep kit (Qiagen). Positive clones were
identified by restriction digests using Apa I and Sac I. Inserts were sequenced using an ABI
automatic sequencer, Model 377, T7 primers, and a Big Dye Terminator Cycle Sequencing Kit
(Applied Biosystems).
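
By way of illustration only, the cycling program stated in the preceding paragraph can be written down as data and its programmed hold time summed. The following Python sketch is not part of the original protocol; the variable and function names are assumptions introduced here, and ramp times between temperatures are ignored.

```python
# Cycling program as stated in the text: (temperature in deg C, hold time in minutes).
INITIAL_DENATURATION = (94, 1.5)
CYCLE_STEPS = [(94, 0.5), (48, 0.5), (72, 0.5)]  # denature, anneal, extend
CYCLE_COUNT = 35
FINAL_EXTENSION = (72, 7.0)


def programmed_minutes(initial, cycle_steps, cycle_count, final):
    """Sum the hold times of a simple three-stage PCR program (ramping ignored)."""
    per_cycle = sum(minutes for _, minutes in cycle_steps)
    return initial[1] + cycle_count * per_cycle + final[1]


total = programmed_minutes(
    INITIAL_DENATURATION, CYCLE_STEPS, CYCLE_COUNT, FINAL_EXTENSION
)
print(f"Programmed hold time: {total} min")
```

Under these assumptions the holds alone add up to about one hour; actual run time on an instrument would be longer because of temperature ramping.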

Obtained sequences were analyzed using the Pileup program of the Wisconsin Genetics
Computer Group (GCG). Repeat units were ordered using primers designed against two highly
conserved regions within the nucleotide sequence of these identified repeat units. Shown in
Table 2B, the sense and antisense primers (5'-GTCTCTATGTCAATGGTTTCACCC-3' / 5'-
TAGCTGCTCTCTGTCCAGTCC-3', SEQ ID NOS: 301 and 302, respectively) faced away from
one another within any one repeat, creating an overlap sequence and thus enabling amplification
across the junction of any two repeat units. PCR reactions, cloning, sequencing, and analysis
were performed as described above.
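
The ordering strategy described above, in which the two primers face away from one another within a repeat, can be sketched with a toy in silico amplification. The short sequences below are invented for illustration and are not CA125 sequences; the function names are likewise assumptions. The sketch shows only that such a primer pair gives no product on a single repeat but gives a product spanning the junction when two repeats lie in tandem.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplify(template, sense_primer, antisense_primer):
    """Return the top-strand product bounded by the two primers, or None."""
    start = template.find(sense_primer)
    if start == -1:
        return None
    # The antisense primer anneals to the top strand at its reverse complement.
    antisense_site = reverse_complement(antisense_primer)
    end = template.find(antisense_site, start + len(sense_primer))
    if end == -1:
        return None
    return template[start:end + len(antisense_site)]


# Hypothetical repeats: the antisense site (CAGTCA) lies upstream of the
# sense site (GGTTCC), so within one repeat the primers point away from each other.
REPEAT_1 = "ACGT" + "CAGTCA" + "AAAA" + "GGTTCC" + "TATA"
REPEAT_2 = "ACGA" + "CAGTCA" + "CCCC" + "GGTTCC" + "TGTG"
SENSE = "GGTTCC"
ANTISENSE = reverse_complement("CAGTCA")

single = amplify(REPEAT_1, SENSE, ANTISENSE)
junction = amplify(REPEAT_1 + REPEAT_2, SENSE, ANTISENSE)
print(single)    # no product within one repeat
print(junction)  # product carries the 3' end of repeat 1 and the 5' end of repeat 2
```

Because the product contains the variable 3' end of one repeat and the variable 5' end of the next, sequencing it indicates which two repeat units are adjacent.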

C. Identification and Assembly of the CA125 Amino Terminal Domain

In search of open reading frames containing sequences in addition to CA125 repeat units,
database searches were performed using the BLAST program available at the National Center for
Biotechnology Information (www.ncbi.nlm.nih.gov/). Using a repeat unit as the query sequence,
cosmid AC008734 was identified as having multiple repeat sequences throughout the unordered (35)
contiguous pieces of DNA, also known as contigs. One of these contigs, #32, was found to have
exons 1 and 2 of a repeat region at its 3' end. Contig #32 was also found to contain a large open
reading frame (ORF) upstream of the repeat sequence. PCR was again used to verify the existence of
this ORF and confirm its connection to the repeat sequence. The specific primers recognized the 3'
end of this ORF (5'-CAGCAGAGACCAGCACGAGTACTC-3') (SEQ ID NO: 51) and sequence
within the repeat (5'-TCCACTGCCATGGCTGAGCT-3') (SEQ ID NO: 52). The remainder of the
amino-terminal domain was assembled from this contig in a similar manner. With each PCR
confirmation, a new primer (see Table 10A) was designed against the assembled sequence and used
in combination with a primer designed against another upstream potential ORF (Set 1: 5'-
CCAGCACAGCTCTTCCCAGGAC-3' / 5'-GGAATGGCTGAGCTGACGTCTG-3' (SEQ ID NO:
53 and SEQ ID NO: 54); Set 2: 5'-CTTCCCAGGACAACCTCAAGG-3' / 5'-
GCAGGATGAGTGAGCCACGTG-3' (SEQ ID NO: 55 and SEQ ID NO: 56); Set 3: 5'-
GTCAGATCTGGTGACCTCACTG-3' / 5'-GAGGCACTGGAAAGCCCAGAG-3' (SEQ ID NO:
57 and SEQ ID NO: 58)). Potential adjoining sequence (contig #7 containing EST AU133673) was
also identified using contig #32 sequence as the query sequence in database searches. Confirmation
primers were designed and used in a typical manner (5'-CTGATGGCATTATGGAACACATCAC-3'
/ 5'-CCCAGAACGAGAGACCAGTGAG-3') (SEQ ID NO: 59 and SEQ ID NO: 60).

In order to identify the 5' end of the CA125 sequence, 5' Rapid Amplification of cDNA Ends
(FirstChoice™ RLM-RACE Kit, Ambion) was performed using tumor cDNA. The primary PCR
reaction used a sense primer supplied by Ambion (5'-GCTGATGGCGATGAATGAACACTG-3')
(SEQ ID NO: 61) and an anti-sense primer specific to confirmed contig #32 sequence (5'-
CCCAGAACGAGAGACCAGTGAG-3') (SEQ ID NO: 62). The secondary PCR was then
performed using nested primers: a sense primer from Ambion (5'-
CGCGGATCCGAACACTGCGTTTGCTGGCTTTGATG-3') (SEQ ID NO: 63) and an anti-sense
primer specific to confirmed contig #7 sequence (5'-CCTCTGTGTGCTGCTTCATTGGG-3') (SEQ ID
NO: 64). The RACE PCR product (a band of approximately 300 bp) was cloned and sequenced as
previously described.

D. Identification and Assembly of the CA125 Carboxy Terminal Domain

Database searches using confirmed repeat units as query also identified a cDNA sequence
(GenBank AK024365) containing not only other repeat units but also a potential carboxy terminal
sequence. The contiguous nature of this sequence with assembled CA125 was confirmed using PCR (5'-
GGACAAGGTGACGAGACTCTAC-3' / 5'-GGAGATCGTCCAGGTCTAGGTGTG-3') (SEQ ID
NO: 303 and SEQ ID NO: 304, respectively) as well as contig and EST analysis.

E. Expression of 6xHis-tagged CA125 repeat in E. coli 

The open reading frame of a CA125 repeat shown in Table 11 was amplified by PCR with the
sense primer (5'-ACCGGATGGATGGGGGAGACAGAGGGTGGGGG-3') (SEQ ID NO: 65) and the
antisense primer (5'-TGTAAGCTTAGGGAGGGAGGATGGAGTGG-3') (SEQ ID NO: 66). PCR
was performed in a reaction mixture consisting of ovarian tumor cDNA derived from 50 ng of
mRNA, 5 pmol each of sense and antisense primers for the CA125 repeat, 0.2 mmol of dNTPs, and
0.625 U of Taq polymerase in 1x buffer in a final volume of 25 ml. This mixture was subjected to 1
minute of denaturation at 95°C followed by 30 cycles of PCR consisting of the following:
denaturation for 30 seconds at 95°C, 30 seconds of annealing at 62°C, and 1 minute of extension at
72°C with an additional 7 minutes of extension on the last cycle. The product was electrophoresed
through a 2% agarose gel for separation. The PCR product was purified and digested with the
restriction enzymes Bam HI and Hind III. This digested PCR product was then ligated into the
expression vector pQE-30, which had also been digested with Bam HI and Hind III. This clone
would allow for expression of recombinant 6xHis-tagged CA125 repeat. Transformed E. coli
(JM109) were grown to an OD600 of 1.5-2.0 at 37°C and then induced with IPTG (0.1 mM) for 4-6
hours at 25°C to produce recombinant protein. Whole E. coli lysate was electrophoresed through a
12% SDS polyacrylamide gel and Coomassie stained to detect highly expressed proteins.

F. Western Blot Analysis

Proteins were separated on a 12% SDS-PAGE gel and electroblotted at 100 V for 40
minutes at 4°C to nitrocellulose membrane. Blots were blocked overnight in phosphate-buffered
saline (PBS) pH 7.3 containing 5% non-fat milk. CA125 antibodies M11, OC125, or ISOBM 9.2
were incubated with the membrane at a concentration of 5 µg/ml in 5% milk/PBS-T (PBS plus 0.1%
TX-100) for 2 hours at room temperature. The blot was washed for 30 minutes
with several changes of PBS and incubated with a 1:10,000 dilution of horseradish peroxidase
(HRP) conjugated goat anti-mouse IgG antibody (Bio-Rad) for 1 hour at room temperature.
Blots were washed for 30 minutes with several changes of PBS and incubated with a
chemiluminescent substrate (ECL from Amersham Pharmacia Biotech) before a 10-second
exposure to X-ray film for visualization.

Figure 4 illustrates three Western immunoblot patterns of the recombinant CA125 repeat
purified from E. coli lysate (lane 2) compared to E. coli lysate with no recombinant protein (lane
1, negative control) and a recombinant protein, TADG-14, which is unrelated to CA125 (lane 3).
As shown, the M11 antibody, the OC125 antibody and the antibody ISOBM 9.2 (an OC125-like
antibody) all recognized the CA125 recombinant repeat (lane 2), but did not recognize either the
E. coli lysate (lane 1) or the unrelated TADG-14 recombinant (lane 3). These data confirm that
the recombinant repeat encodes both independent epitopes for CA125, the OC125 epitope and the
M11 epitope.

G. Northern Blot Analysis 

Total RNA samples (approximately 10 µg) were separated by electrophoresis through a
6.3% formaldehyde, 1.2% agarose gel in 0.02 M MOPS, 0.05 M sodium acetate (pH 7.0), and
0.001 M EDTA. The RNAs were then blotted to Hybond-N (Amersham) by capillary action in
20x SSPE and fixed to the membrane by baking for 2 hours at 80°C. A PCR product
representing one 400 bp repeat of the CA125 molecule was radiolabelled using the Prime-a-Gene
Labeling System available from Promega (cat. #U1100). The blot was probed and stripped
according to the ExpressHyb Hybridization Solution protocol available from Clontech (Catalog
#8015-1).

RESULTS 

In 1997, a system was described by a co-inventor of the present invention and others for
purification of CA125 (primarily from patient ascites fluid), which when followed by cyanogen
bromide digestion, resulted in peptide fragments of CA125 of 60 kDa and 40 kDa [O'Brien TJ et al.,
More than 15 years of CA125: What is known about the antigen, its structure and its function, Int J
Biological Markers 13(4):188-195 (1998)]. Both fragments were identifiable by Coomassie blue
staining on polyacrylamide gels and by Western blot. Both fragments were shown to bind both
OC125 and M11 antibodies, indicating both major classes of epitopes were preserved in the released
peptides (Figure 1).

Protein sequencing of the 40 kDa band yielded both amino terminal sequences and some
internal sequences generated by protease digestion (Table 1 - SEQ ID NOS: 1-4). Insufficient yields
of the 60 kDa band resulted in unreliable sequence information. Unfortunately, efforts to amplify
PCR products utilizing redundant primers designed to these sequences were not successful. In mid
2000, an EST (#BE005912) was entered into the GCG database, which contained homology to the 40
kDa band sequence as shown in Table 1 (SEQ ID NOS: 5 and 6). The translation of this EST
indicated good homology to the amino terminal sequence of the 40 kDa repeat (e.g. PGSREIFKTTE)
with only one amino acid difference (i.e. an asparagine is present instead of phenylalanine in the EST
sequence). Also, some of the internal sequences are partially conserved (e.g. SEQ ID NO: 2 and to a
lesser extent, SEQ ID NO: 3 and SEQ ID NO: 4). More importantly, all the internal sequences are
preceded by a basic amino acid (Table 1, indicated by arrows) appropriate for proteolysis by the
trypsin used to create the internal peptides from the 40 kDa cyanogen bromide repeat. Utilizing the
combined sequences, those obtained by amino acid sequencing and those identified in the EST
(#BE005912) and a second EST (#AA640762) identified in the database, primers were created
as follows: sense primer, 5'-GGA GAG GGT TCT GCA GGG TC-3' (SEQ ID NO: 7), representing
amino acids ERVLQG, and anti-sense primer, 5'-GTG AAT GGT ATC AGG AGA GG-3' (SEQ ID
NO: 9), representing PLLIPF. Using PCR, the presence of transcripts representing these
sequences was confirmed in ovarian tumors, as was their absence in normal ovary and either very
low levels or no detectable levels in a mucinous tumor (Figure 2A). The existence of transcripts was
further confirmed in cDNA derived from multiple primary ovarian carcinoma cell lines, as was the
absence of transcripts in matched lymphocyte cultures from the same patient (Figure 2B).
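
By way of illustration only, the stated correspondence between the two primers (SEQ ID NO: 7 and SEQ ID NO: 9) and the peptide sequences ERVLQG and PLLIPF can be checked with a short Python sketch. The codon table is the standard genetic code; the function names are assumptions introduced here, and the sketch simply scans the three forward reading frames rather than assuming which frame the authors used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame):
    """Translate complete codons of seq starting at offset frame (0, 1 or 2)."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


SENSE_PRIMER = "GGAGAGGGTTCTGCAGGGTC"      # SEQ ID NO: 7, spaces removed
ANTISENSE_PRIMER = "GTGAATGGTATCAGGAGAGG"  # SEQ ID NO: 9, spaces removed

sense_frames = [translate(SENSE_PRIMER, f) for f in range(3)]
antisense_frames = [
    translate(reverse_complement(ANTISENSE_PRIMER), f) for f in range(3)
]
print(sense_frames)
print(antisense_frames)
```

In this check the sense primer reads ERVLQG in the frame beginning at its second base, and the reverse complement of the anti-sense primer reads PLLIPF in its first frame, in agreement with the text.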

After cloning and sequencing of the amplified 400 base pair PCR products, a series of
sequences was identified, which had high homology to each other but which were clearly distinct
repeat entities (Figure 3) (SEQ ID NOS: 158 through 161).

Examples of each category of repeats were sequenced, and the results are shown in Tables
3, 4, and 5. The sequences represent amplification and sequence data of PCR products obtained
using oligonucleotide primers derived from an EST (GenBank Accession No. BE005912). Table
3 illustrates the amino acid sequence for a 400 bp repeat in the CA125 molecule, which is
identified as SEQ ID NO: 11 through SEQ ID NO: 21. Table 4 illustrates the amino acid
sequence for an 800 bp repeat in the CA125 molecule, which corresponds to SEQ ID NO: 22
through SEQ ID NO: 35. Table 5 illustrates the amino acid sequence for a 1200 bp repeat in the
CA125 molecule, which is identified as SEQ ID NO: 36 through SEQ ID NO: 46. Assembly of
these repeat sequences (which showed 75-80% homology to each other as determined by GCG
Software (GCG = Genetics Computer Group) using the Pileup application) utilizing PCR
amplification and sequencing of overlapping sequences allowed for the construction of a 9 repeat
structure. The amino acid sequence for the 9 repeat structure is shown in Table 6 as SEQ ID NO: 47.
The individual C-enclosures are highlighted in the table.

Using the assembled repeat sequence in Table 6 to search GenBank databases, a cDNA
sequence referred to as GenBank Accession No. AK024365 (entered on 9/29/00) was discovered.
Table 7 shows the amino acid sequence for AK024365, which corresponds to SEQ ID NO: 48.
AK024365 was found to overlap with two repeats of the assembled repeat sequence shown in
Table 6. Individual C-enclosures are highlighted in Table 7.

The cDNA for AK024365 allowed alignment of four additional repeats as well as a
downstream carboxy terminus sequence of the CA125 gene. Table 8 illustrates the complete
DNA sequence of 13 repeats contiguous with the carboxy terminus of the CA125 molecule,
which corresponds to SEQ ID NO: 49. Table 9 illustrates the complete amino acid sequence of
the 13 repeats and the carboxy terminus of the CA125 molecule, which corresponds to SEQ ID
NO: 50. The carboxy terminus domain was further confirmed by the existence of two ESTs
(GenBank Accession Nos. AW150602 and AI923224) in the GenBank database, both of which
confirmed the stop codon indicated (TGA) as well as the poly A signal sequence (AATAA) and
the poly A tail (see Table 9). The presence of these repeats has been confirmed in serous ovarian
tumors and their absence in normal ovarian tissue and mucinous tumors as expected (see Figure
2A). Also, the transcripts for these repeats have been shown to be present in tumor cell lines
derived from ovarian tumors, but not in normal lymphocyte cell lines (Figure 2B). Moreover,
Northern blot analysis of mRNA derived from normal tissue or ovarian carcinoma and probed with a
32P-labeled CA125 repeat sequence (as shown in Figure 6) confirmed the presence of an RNA
transcript in excess of 20 kb in ovarian tumor extracts (see Figure 2B).

To date, 45 repeat sequences have been identified with high homology to each other. To
order these repeat units, overlapping sequences were amplified using a sense primer (5'-GTC TCT
ATG TCA ATG GTT TCA CCC-3') (SEQ ID NO: 305) from an upstream repeat and an antisense
primer from a downstream repeat sequence (5'-TAG CTG CTC TCT GTC CAG TCC-3')
(SEQ ID NO: 306). Attempts have been made to place these repeats in a contiguous fashion as
shown in Figure 3. There is some potential redundancy. Further, there is evidence from overlapping
sequences that some repeats exist in more than one location in the sequence, giving a total of more
than 60 repeats in the CA125 molecule (see Table 21, SEQ ID NO: 162).

Final confirmation of the relationship of the putative CA125 repeat domain to the known
CA125 molecule was achieved by expressing a recombinant repeat domain in E. coli. In Figure 4,
expression of a recombinant CA125 repeat domain is shown in lane 2 compared to the vector alone in
lane 1, Panel D. A series of Western blots representing E. coli extracts of vector alone in lane 1,
CA125 recombinant protein in lane 2, and recombinant TADG-14 (an unrelated recombinant
protease) in lane 3 were probed with the CA125 antibodies M11, Panel A; OC125, Panel B; and
ISOBM 9.2, Panel C. In all cases, CA125 antibodies recognized only the recombinant CA125
antigen (lane 2 of each panel).

To further characterize the epitope location of the CA125 antibodies, recombinant CA125
repeat was digested with the endoprotease Lys-C and separately with the protease Asp-N. In both
cases, epitope recognition was destroyed. As shown in Figure 5, the initial cleavage site for Asp-N
is at amino acid #76 (indicated by arrow in Figure 5C). This sequence (amino acids #1-76), a 17
kDa band, was detected with anti-histidine antibodies (Figure 5A, Lane 3) and found to have no
capacity to bind CA125 antibodies (Figure 5B, Lane 3). The upper bands in Figures 5A and 5B
represent the undigested remaining portion of the CA125 recombinant repeat. From these data, one
can reasonably conclude that epitopes are either located at the site of cleavage and are destroyed by
Asp-N or are downstream from this site and also destroyed by cleavage. Likewise, cleavage with
Lys-C would result in a peptide, which includes amino acids #68-154 (Figure 5C) and again, no
antibody binding was detected. In view of the foregoing, it seems likely that epitope binding resides
in the cysteine loop region containing a possible disulfide bridge (amino acids #59-79). Final
confirmation of epitope sites is being pursued by mutating individual amino acids.

To determine transcript size of the CA125 molecule, Northern blot analysis was performed on
mRNA extracts from both normal and tumor tissues. In agreement with the notion that CA125 may
be represented by an unusually large transcript due to its known megadalton size in tumor sera,
ascites fluid, and peritoneal fluid [Nustad K et al., CA125 - epitopes and molecular size, Int. J. of
Biolog. Markers 13(4):196-199 (1998)], a transcript was discovered which barely entered the gel
from the holding well (Figure 6). CA125 mRNA was only present in the tumor RNA sample and,
while a precise designation of its true size remains difficult due to the lack of appropriate standards,
its unusually large size would accommodate a protein core structure in excess of 11,000 amino acids.

Evidence demonstrates that the repeat domain of the CA125 molecule encompasses a
minimum of 45 different 156 amino acid repeat units and possibly greater than 60 repeats, as
individual repeats occur more than once in the sequence. This finding may well account for the
extraordinary size of the observed transcript. The amino acid composition of the repeat units (Figure
7A, 7C, Table 21) indicates that the sequence is rich in serine, threonine, and proline, typical of the
high STP repeat regions of the mucin genes [Gum Jr., JR, Mucin genes and the proteins they encode:
Structure, diversity and regulation, Am J Respir. Cell Mol Biol. 7:557-564 (1992)]. Results suggest
that the downstream end of the repeat is heavily glycosylated.

Also noteworthy is a totally conserved methionine at position 24 of the repeat (Figure 7A,
7C). It is this methionine which allowed cyanogen bromide digestion of the CA125 molecule,
resulting in the 40 kDa glycopeptide that was identified with OC125 and M11 antibodies in Western
blots of the CNBr digested peptides. These data predict that the epitopes for the CA125 antibodies
are located in the repeat sequence. By production of a recombinant product representing the repeat
sequence, results have confirmed this to be true. A potential disulfide bond is noted, which would
encompass a C-enclosure comprising 19 amino acids enclosed by two cysteines at positions #59 and
#79. The cysteines are totally conserved, which suggests a biological role for the resulting putative
C-enclosure in each repeat. As mentioned above, it is likely that the OC125 and M11 epitopes are
located in the C-enclosure, indicating its relative availability for immune detection. This is probably
due to the C-enclosure structure and the paucity of glycosylation in the immediate surrounding areas.
Domain searches also suggest some homology in the repeat domain to an SEA domain commonly
found in the mucin genes [Williams SJ et al., MUC13, a novel human cell surface mucin expressed
by epithelial and hemopoietic cells, J. of Biol. Chem. 276(21):18327-18336 (2001)] beginning at amino
acid #1 and ending at #131 of each repeat. No biological function has been described for this
domain.
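
The consequence of a single conserved methionine per repeat can be sketched as follows. Cyanogen bromide cleaves on the carboxyl side of methionine, so a tandem array with one methionine per 156-residue repeat is expected to release fragments of one repeat length that each carry both cysteines of a C-enclosure. The Python sketch below is illustrative only: the placeholder sequence (alanine at every position other than the stated methionine and cysteines) and the function names are assumptions and do not represent the CA125 sequence.

```python
REPEAT_LENGTH = 156
MET_POSITION = 24          # conserved methionine (1-based, from the text)
CYS_POSITIONS = (59, 79)   # cysteines bounding the C-enclosure (1-based)


def toy_repeat():
    """Build a placeholder repeat: 'A' everywhere except the stated M and C."""
    residues = ["A"] * REPEAT_LENGTH
    residues[MET_POSITION - 1] = "M"
    for position in CYS_POSITIONS:
        residues[position - 1] = "C"
    return "".join(residues)


def cnbr_fragments(protein):
    """Split a protein string on the carboxyl side of every methionine."""
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue == "M":
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


tandem = toy_repeat() * 3
fragments = cnbr_fragments(tandem)
lengths = [len(fragment) for fragment in fragments]
print(lengths)
```

In this toy array the internal fragments run from residue 25 of one repeat to residue 24 of the next, so each is 156 residues long and retains an intact cysteine pair, which is consistent with cleavage fragments that still bind the antibodies.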

Based on homology of the repeat sequences to chromosome 19q13.2 (cosmid #AC008734)
and confirmed by genomic amplification, it has been established that each repeat is comprised of 5
exons (covering approximately 1900 bases of genomic DNA): exon 1 comprises 42 amino acids
(#1-42); exon 2 comprises 23 amino acids (#43-65); exon 3 comprises 58 amino acids (#66-123); exon 4
comprises 12 amino acids (#124-135); and exon 5 comprises 21 amino acids (#136-156) (see Figure
7B). Homology pile-ups of individual exons have also been completed (see Figure 7C), which
indicate that exon 1 has a minimum of 31 different copies of the exon; exon 2 has 27 copies; exon 3
has 28 copies; exon 4 has 28 copies; and exon 5 has 21 copies. If all exons were only found in a
single configuration relative to each other, one could determine that a minimum of 31 repeats
were present in the CA125 molecule. Using the exon 2 pile-up data as an example, it has been
established as mentioned above that there are 27 individual exon 2 sequences. Using exon 2, which
was sequenced fully in both the repeat units and the overlaps, results established that a minimum of
45 repeat units are present when exon 2 is combined with unique other exon combinations. However,
based on overlap sequence information, 60+ repeat units are likely present in the CA125 molecule
(Table 21). This larger number of repeat units can be accounted for by the presence of the same
repeat unit occurring in more than one location.
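
The exon figures given above can be cross-checked arithmetically. The following Python sketch, offered for illustration only, recomputes each exon length from its stated residue range, confirms that the five exons account for the 156 amino acids of a repeat unit, and derives the lower bound of 31 repeats from the exon with the most distinct copies. The dictionary and function names are assumptions introduced here.

```python
# Residue ranges (1-based, inclusive) and stated lengths of the five repeat exons.
REPEAT_EXON_SPANS = {1: (1, 42), 2: (43, 65), 3: (66, 123), 4: (124, 135), 5: (136, 156)}
STATED_EXON_LENGTHS = {1: 42, 2: 23, 3: 58, 4: 12, 5: 21}
# Minimum number of distinct copies of each exon found in the pile-ups.
DISTINCT_EXON_COPIES = {1: 31, 2: 27, 3: 28, 4: 28, 5: 21}


def span_length(span):
    """Number of residues in an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


computed_lengths = {
    exon: span_length(span) for exon, span in REPEAT_EXON_SPANS.items()
}
repeat_length = sum(computed_lengths.values())

# Every repeat contains one copy of each exon, so the number of repeats is at
# least the largest count of distinct copies observed for any single exon.
minimum_repeats = max(DISTINCT_EXON_COPIES.values())

print(computed_lengths, repeat_length, minimum_repeats)
```

The higher estimates of 45 and 60+ repeat units in the text rest on exon combinations and overlap sequencing, which this simple count does not model.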

Currently, the repetitive units of the repeat domain of the CA125 molecule constitute the
majority of its extracellular molecular structure. These sequences have been presented in a tandem
fashion based on overlap sequencing data. Some sequences may be incorrectly placed and some
repeat units may not as yet be identified (Table 21). More recently, an additional repeat was
identified in CA125 as shown in Tables 22 and 23 (SEQ ID NOS: 307 and 308). The exact position
has not yet been identified. Also, there is a potential that alternate splicing and/or mutation could
account for some of the repeat variants that are listed. Studies are being conducted to compare
normal tissue derived CA125 repeats to individual tumor derived CA125 repeats to determine if such
variation is present. Currently, the known exon configurations would easily accommodate the greater
than 60 repeat units as projected. It is, therefore, unlikely that alternate splicing is a major
contributor to the repetitive sequences in CA125. It should also be noted that the genomic database
for chromosome 19q13.2 only includes about 10 repeat units, thus indicating a discrepancy between
the data of the present invention (more than 60 repeats) and the genomic database. A recent
evaluation of the methods used for selection and assembly for genomic sequence [Marshall E, DNA
Sequencing: Genome teams adjust to shotgun marriage, Science 292:1982-1983 (2001)] reports that
"more research is needed on repeat blocks of almost identical DNA sequence which are more
common in the human genome. Existing assembly programs can't handle them well and often delete
them." The CA125 repeat units located on chromosome 19 may well be victims of deletion in the
genomic database, thus accounting for most CA125 repeat units absent from the current databases.

A. Sequence Confirmation and Assembly of the Amino Terminal Domain (Domain 1) of the CA125 Molecule

As previously mentioned, homology for repeat sequences was found in the chromosome 19
cosmid AC008734 of the GCG database. This cosmid at the time consisted of 35 unordered contigs.
After searching the cosmid for repeat sequences, contig #32 was found to have exons 1 and 2 of a
repeat unit at its 3' end. Contig #32 also had a large open reading frame upstream from the two
repeat units, which suggested that this contig contained sequences consistent with the amino terminal
end of the CA125 molecule. A sense primer was synthesized to the upstream non-repeat part of
contig #32 coupled with a specific primer from within the repeat region (see Methods). PCR
amplification of ovarian tumor cDNA confirmed the contiguous positioning of these two domains.

The PCR reaction yielded a band of approximately 980 bp. The band was sequenced and
found to connect the upstream open reading frame to the repeat region of CA125. From these data,
more primer sets (see Methods) were synthesized and used in PCR reactions to piece together the
entire open reading frame contained in contig #32. To find the 5'-most end of the sequence, an EST
(AU133673) was discovered, which linked contig #32 to contig #7 of the same cosmid. Specific
primers were synthesized (5'-CTGATGGCATTATGGAACACATCAC-3' (SEQ ID NO: 59) and
5'-CCCAGAACGAGAGACCAGTGAG-3' (SEQ ID NO: 60)) to the EST and contig #32. A PCR
reaction was performed to confirm that part of the EST sequence was in fact contiguous with contig
#32. Confirmation of this contiguous 5' sequencing strategy using overlapping sequences
allowed the assembly of the 5' region (Domain 1) (Figure 8A). 5' RACE PCR was performed on
tumor cDNA to confirm the amino terminal sequence of CA125. The test confirmed the presence of
contig #7 sequence at the amino terminal end of CA125.

The amino terminal domain comprises five genomic exons covering approximately 13,250 bp. 
Exon 1, a small exon, (amino acids #1-33) is derived from contig #7 (Figure 8A). The remaining 
exons are all derived from contig #32: Exon 2 (amino acids #34-1593), an extraordinarily large exon, 
Exon 3 (amino acids #1594-1605), Exon 4 (amino acids #1606-1617) and Exon 5 (amino acids 
#1618-1637) (see Figure 8A). 
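The exon boundaries listed above can be checked for internal consistency with a short script. The following is only an illustrative sketch: the dictionary and function names are hypothetical and not part of the disclosure, and the only data used are the amino acid ranges stated in the preceding paragraph.

```python
# Amino acid ranges of the five Domain 1 exons as listed in the text
# (1-based, inclusive). Names below are illustrative only.
DOMAIN1_EXONS = {
    1: (1, 33),
    2: (34, 1593),
    3: (1594, 1605),
    4: (1606, 1617),
    5: (1618, 1637),
}


def exon_lengths(exons):
    """Return {exon number: length in amino acids}."""
    return {n: end - start + 1 for n, (start, end) in exons.items()}


def is_contiguous(exons):
    """True if every exon starts one residue after the previous exon ends."""
    ordered = [exons[n] for n in sorted(exons)]
    return all(nxt[0] == prev[1] + 1 for prev, nxt in zip(ordered, ordered[1:]))


lengths = exon_lengths(DOMAIN1_EXONS)
print(lengths)                # per-exon lengths
print(sum(lengths.values()))  # total residues in Domain 1
print(is_contiguous(DOMAIN1_EXONS))
```

Under these ranges, Exon 2 alone contributes 1,560 of the 1,637 residues, which is in line with its description as an extraordinarily large exon.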

Potential N-glycosylation sites, marked (x), are encoded at positions #81, #271, #320, #624,
#795, #834, #938, and #1,165 (see Figure 8B). O-glycosylation sites are extraordinarily abundant
and essentially cover the amino terminal domain (Figure 8B). As shown by the O-glycosylation
pattern, Domain 1 is highly enriched in both threonine and serine (Figure 8B).
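As an illustration of how potential N-glycosylation sites of this kind are typically located, the sketch below scans a protein sequence for the commonly used sequon N-X-S/T, where X is any residue except proline. This is an assumption about the general approach and not the specific prediction method behind Figure 8B. Because the Domain 1 sequence is not reproduced in this section, the example uses the 80-residue translation of EST AA640762 from Table 1 as a stand-in; the function names are hypothetical.

```python
import re

# Translation of EST AA640762 as given in Table 1 (80 residues),
# used here only as an example input.
EST_TRANSLATION = (
    "RRPGSRKFNTTERVLQGLLR"
    "PVFKNTSVGPLYSGCRLTLL"
    "RPKKDGAATKVDAICTYRPD"
    "PKSPGLDREQLYWELSQGDA"
)


def n_glycosylation_sequons(protein):
    """Return 1-based positions of N in every N-X-S/T sequon (X is not P).

    A lookahead is used so that overlapping sequons are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein)]


def ser_thr_fraction(protein):
    """Fraction of residues that are serine or threonine, a rough
    indicator of O-glycosylation capacity."""
    return sum(protein.count(aa) for aa in "ST") / len(protein)


print(n_glycosylation_sequons(EST_TRANSLATION))
print(round(ser_thr_fraction(EST_TRANSLATION), 3))
```

A sequon match indicates only that a site is possible; whether it is actually glycosylated depends on context that a motif scan does not capture.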
B. Sequence Confirmation and Assembly of the CA125 Carboxy Terminal End (Domain 3) 

A search of Genbank using the repeat sequences described above uncovered a cDNA
sequence referred to as Genbank accession number AK024365. This sequence was found to have 2
repeat sequences, which overlapped 2 known repeat sequences of a series of 6 repeats. As a result,
the cDNA allowed the alignment of all six carboxy terminal repeats along with a unique carboxy
terminal sequence. The carboxy terminus was further confirmed by the existence of two other ESTs
(Genbank accession numbers AW150602 and AI923224), both of which confirmed a stop codon as
well as a poly-A signal sequence and a poly-A tail (see GCG database #AF414442). The sequence of
the carboxy terminal domain was confirmed using primers designed to sequence just downstream of
the repeat domain (sense primer 5'-GGA CAA GGT CAC CAC ACT CTA C-3') (SEQ ID NO: 303)
and an antisense primer (5'-GCA GAT CCT CCA GGT CTA GGT GTG-3') (SEQ ID NO: 304)
designed to the carboxy terminus (Figure 9A).

The carboxy terminal domain covers more than 14,000 genomic bp. By ligation, this domain
comprises nine exons as shown in Figure 9A. The carboxy terminus is defined by a 284 amino acid
sequence downstream from the repeat domains (see Figure 9B). Both N-glycosylation sites marked
(x) (#31, #64, #103, #140, #194, #200) and a small number of O-glycosylation sites marked (o) are
predicted for the carboxy end of the molecule (Figures 9A, 9B). Of special note is a putative
transmembrane domain at positions #230-#252 followed by a cytoplasmic domain, which is
characterized by a highly basic sequence adjacent to the membrane (#256-#260) as well as several
potential S/T phosphorylation sites (#254, #255, #276) and tyrosine phosphorylation sites (#264,
#273, #274) (Figures 9A, 9B).
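The positions quoted above can be organized into a simple feature map to confirm that each class of site falls on the expected side of the membrane. The sketch below is illustrative only. It assumes that all positions are numbered within the 284 amino acid carboxy terminal sequence, that residues before the transmembrane segment are extracellular and residues after it are cytoplasmic, and it takes the putative cleavage site (#176-181) from the Discussion. All names are hypothetical.

```python
DOMAIN3_LENGTH = 284        # residues downstream of the repeat domain
TRANSMEMBRANE = (230, 252)  # putative transmembrane segment (inclusive)
CLEAVAGE_SITE = (176, 181)  # putative cleavage site cited in the Discussion

FEATURES = {
    "N-glycosylation": [31, 64, 103, 140, 194, 200],
    "S/T phosphorylation": [254, 255, 276],
    "Tyr phosphorylation": [264, 273, 274],
}


def region(position, tm=TRANSMEMBRANE):
    """Classify a residue relative to the transmembrane segment."""
    start, end = tm
    if position < start:
        return "extracellular"
    if position <= end:
        return "transmembrane"
    return "cytoplasmic"


def summarize(features):
    """Map each feature class to the sorted list of regions it occupies."""
    return {name: sorted({region(p) for p in sites})
            for name, sites in features.items()}


print(summarize(FEATURES))
print("TM length:", TRANSMEMBRANE[1] - TRANSMEMBRANE[0] + 1)
print("cytoplasmic tail length:", DOMAIN3_LENGTH - TRANSMEMBRANE[1])
print("cleavage site to TM start:", TRANSMEMBRANE[0] - CLEAVAGE_SITE[1])
```

Under these assumptions the glycosylation sites are all extracellular, the phosphorylation sites are all cytoplasmic, and the cleavage site ends 49 residues before the transmembrane segment, which agrees with the "approximately 50 amino acids upstream" noted in the Discussion.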

Assembly of the CA125 molecule as validated by PCR amplification of overlap sequence
provides a picture of the whole molecule (see Figure 10 and Table 21). The complete nucleotide
sequence is available in GenBank, Accession #AF414442, and the amino acid sequence as currently
aligned is shown in Table 21.

DISCUSSION 

The CA125 molecule comprises three major domains: an extracellular amino terminal domain
(Domain 1), a large multiple repeat domain (Domain 2) and a carboxy terminal domain (Domain 3),
which includes a transmembrane anchor with a short cytoplasmic domain (Figure 10). The amino
terminal domain is assembled by combining five genomic exons, four very short amino terminal
sequences and one extraordinarily large exon, which often typifies mucin extracellular glycosylated
domains [Desseyn JL et al., Human mucin gene MUC5B, the 10.7-kb large central exon encodes
various alternate subdomains resulting in a super-repeat. Structural evidence for a 11p15.5 gene
family, J. Biol. Chem. 272(6):3168-3178 (1997)]. This domain is dominated by its capacity for
O-glycosylation and its resultant richness in serine and threonine residues. Overall, the potential for
O-glycosylation essentially covers this domain and, as such, may allow the carbohydrate superstructure
to influence ECM interaction at this end of the CA125 molecule (Figure 8). There is one short area
(amino acids #74-120) where little or no glycosylation is predicted, which could allow for
protein-protein interaction in the extracellular matrix.

Efforts to purify CA125 over the years were obviously complicated by the presence of this
amino terminal domain, which is unlikely to have any epitope sites recognized by the OC125 or M11
class antibodies. As the CA125 molecule is degraded in vivo, it is likely that this highly glycosylated
amino terminal end will be found associated with varying numbers of repeat units. This could very
well account for both the charge and size heterogeneity of the CA125 molecule so often identified
from serum and ascites fluid. Also of note are two T-TALK sequences at amino acids #45-58
(underlined in Figure 8B), which are unique to the CA125 molecule.

The extracellular repeat domain, which characterizes the CA125 molecule, also represents a
major portion of the molecular structure. It is downstream from the amino terminal domain and
presents itself in a much different manner to its extracellular matrix neighbors. These repeats are
characterized by many features including a highly conserved nature (Figure 3) and a uniformity in
exon structure (Figure 7). But most consistently, a cysteine enclosed sequence may form a cysteine
loop (Table 21). This structure may provide extraordinary potential for interaction with neighboring
matrix molecules. Domain 2 encompasses the 156 amino acid repeat units of the CA125 molecule.
The repeat domain constitutes the largest proportion of the CA125 molecule (Table 21 and Figure
10). Because it has been known for more than 15 years that antibodies bind in a multivalent fashion
to CA125, it has been predicted that the CA125 molecule would include multiple repeat domains
capable of binding the OC125 and M11 class of sentinel antibodies which define this molecule
[O'Brien et al., New monoclonal antibodies identify the glycoprotein carrying the CA125 epitope,
Am J Obstet Gynecol 165:1857-1964 (1991); Nustad K et al., Specificity and affinity of 26
monoclonal antibodies against the CA125 antigen: First report from the ISOBM TD-1 workshop,
Tumor Biology 17:196-219 (1996); and Bast RC et al., A radioimmunoassay using a monoclonal
antibody to monitor the course of epithelial ovarian cancer, N. Engl. J. Med. 309:883-887 (1983)]. In
the present invention, more than 60 repeat units have been identified, which are in tandem array in
the extracellular portion of the CA125 molecule. Individual repeat units have been confirmed by
sequencing and further identified by PCR amplification of the overlapping repeat sequences. Results
confirm the contiguous placement of most repeats relative to their neighbors (Table 21).
Initial evidence suggests that this area is a potential site for antibody binding and also for
ligand binding. The highly conserved methionine and several highly conserved sequences within the
repeat domain also suggest a functional capacity for these repeat units. The extensive glycosylation
of exons 4 and 5 of the repeat unit and the N-glycosylation potential in exon 1 and the 5' end of exon 2
might further point to a functional capacity for the latter part of exon 2 and exon 3, which includes the
C-enclosure (see Figure 7). The C-enclosure might be a prime target for
protease activity, and such cleavage may well explain the difficulty experienced by many
investigators in obtaining an undigested CA125 parent molecule. Such activity might explain the
diffuse pattern of antibody binding and the loss of antibody binding for molecules of less than
200,000 kDa. Proteolysis would destroy the epitopes and, therefore, only multiple repeats could be
identified by blotting with CA125 antibodies. The repeat unit organization also suggests the potential
for a multivalent interaction with extracellular entities.
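The conservation between repeat units and the position of the cysteine enclosed sequence can be illustrated with a simple comparison. The sketch below is not the analysis behind Figures 3 and 7; it merely compares residues 1-50 of two repeat clones listed in Table 3 (clones 12 and 34) position by position and reports where the cysteines fall. The function names are hypothetical.

```python
# Residues 1-50 of two repeat units as listed in Table 3.
CLONE_12 = "ERVLQGLLRSLFKSTSVGPLYSGCRLTLLRPEKDGTATGVDAICTHHPDP"
CLONE_34 = "ERVLQGLLMPLFKNTSVSSLYSGCRLTLLRPEKDGAATRADAVCTHRPDP"


def percent_identity(a, b):
    """Percent of aligned positions with identical residues.

    Assumes the two sequences are already aligned and of equal length,
    as they are in the gap-free first block of Table 3.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def cysteine_positions(protein):
    """1-based positions of every cysteine residue."""
    return [i + 1 for i, aa in enumerate(protein) if aa == "C"]


print(percent_identity(CLONE_12, CLONE_34))
print(cysteine_positions(CLONE_12), cysteine_positions(CLONE_34))
```

In both clones the two cysteines sit at the same positions within this block, leaving 19 residues between them. That span would correspond to the cysteine enclosed sequence, if the two residues do in fact pair.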

The carboxy terminal domain of the CA125 molecule comprises an extracellular domain,
which does not have any homology to other known domains. It encodes a typical transmembrane
domain and a short cytoplasmic tail. It also contains a proteolytic cleavage site approximately 50
amino acids upstream from the transmembrane domain. This would allow for proteolytic cleavage
and release of the CA125 molecule (Figure 9). As indicated by Fendrick et al. [CA125
phosphorylation is associated with its secretion from the WISH human amnion cell line, Tumor
Biology 18:278-289 (1997)], release of the CA125 molecule is preceded by phosphorylation and
sustained by inhibitors of phosphatases, especially inhibition of phosphatase 2B. The cytoplasmic
tail, which contains S/T phosphorylation sites next to the transmembrane domain and tyrosine
phosphorylation sites downstream from there, could accommodate such phosphorylation. A very
distinguishable positively charged sequence is present upstream from the tyrosine, suggesting a signal
transduction system involving negatively charged phosphate groups and positively charged lysine and
arginine groups.

These features of the CA125 molecule suggest a signal transduction pathway involvement in
the biological function of CA125 [Fendrick JL et al., CA125 phosphorylation is associated with its
secretion from the WISH human amnion cell line, Tumor Biology 18:278-289 (1997); and Konishi I et
al., Epidermal growth factor enhances secretion of the ovarian tumor-associated cancer antigen
CA125 from the human amnion WISH cell line, J. Soc. Gynecol. Invest. 1:89-96 (1994)]. They also
reinforce the prediction of phosphorylation prior to CA125 release from the membrane surface as
previously proposed [Fendrick JL et al., CA125 phosphorylation is associated with its secretion from
the WISH human amnion cell line, Tumor Biology 18:278-289 (1997); and Konishi I et al., Epidermal
growth factor enhances secretion of the ovarian tumor-associated cancer antigen CA125 from the
human amnion WISH cell line, J. Soc. Gynecol. Invest. 1:89-96 (1994)]. Furthermore, a putative
proteolytic cleavage site on the extracellular side of the transmembrane domain is present at position
#176-181.

How well does the CA125 structure described in the present invention compare to the
previously known CA125 structure? O'Brien et al. reported that a number of questions needed to be
addressed: 1) the multivalent nature of the molecule; 2) the heterogeneity of CA125; 3) the
carbohydrate composition; 4) the secretory or membrane bound nature of the CA125 molecule; 5) the
function of the CA125 molecule; and 6) the elusive CA125 gene [More than 15 years of CA125:
What is known about the antigen, its structure and its function, Int J Biological Markers
13(4):188-195 (1998)]. Several of these questions have been addressed in the present invention including, of
course, the gene and its protein core product. Perhaps most interesting is the question of whether
an individual large transcript accounted for the whole CA125 molecule, or whether a number of smaller
transcripts represented subunits that specifically associated to produce the CA125 molecule.
From the results produced by way of the present invention, it is now apparent that the transcript of
CA125 is large, similar to some of the mucin gene transcripts, e.g., MUC5B [see Verma M et al.,
Mucin genes: Structure, expression and regulation, Glycoconjugate J. 11:172-179 (1994); and
Gendler SJ et al., Epithelial mucin genes, Annu. Rev. Physiol. 57:607-634 (1995)]. The protein core
extracellular domains all have a high capacity for O-glycosylation and, therefore, probably account
for the heterogeneity of charge and size encountered in the isolation of CA125. The data also
confirm the O-glycosylation inhibition data, indicating CA125 to be rich in O-glycosylation [Lloyd
KO et al., Synthesis and secretion of the ovarian cancer antigen CA125 by the human cancer cell line
NIH: OVCAR-3, Tumor Biology 22:77-82 (2001); Lloyd KO et al., Isolation and characterization of
ovarian cancer antigen CA125 using a new monoclonal antibody (VK-8): Identification as a
mucin-type molecule, Int. J. Cancer 71:842-850 (1997); and Fendrick JL et al., Characterization of CA125
synthesized by the human epithelial amnion WISH cell line, Tumor Biology 14:310-318 (1993)].

The repeat domain, which includes more than 60 repeat units, accounts for the multivalent
nature of the epitopes present, as each repeat unit likely contains epitope binding sites for both
OC125-like antibodies and M11-like antibodies. The presence of a transmembrane domain and
cleavage site confirms the membrane association of CA125, and reinforces the data which indicate a
dependence of CA125 release on proteolysis. Also, the release of CA125 from the cell surface may
well depend on cytoplasmic phosphorylation and be the result of EGF signaling [Nustad K et al.,
Specificity and affinity of 26 monoclonal antibodies against the CA125 antigen: First report from the
ISOBM TD-1 workshop, Tumor Biology 17:196-219 (1996)]. As for the question of inherent
capacity of CA125 for proteolytic activity, this does not appear to be the case. However, it is likely
that the associated proteins isolated along with CA125 (e.g., the 50 kDa protein which has no
antibody binding ability) may have proteolytic activity. In any case, proteolysis of an extracellular
cleavage site is the most likely mechanism of CA125 release. Such cleavage would be responsive to
cytoplasmic signaling and mediated by an associated extracellular protease activity.

In summary, the large number of tandem repeats of the CA125 molecule, which dominate its
molecular structure and contain the likely epitope binding sites of the CA125 molecule, was
unexpected. Also, one cannot as yet account for the proteolytic activity, which has plagued the
isolation and characterization of this molecule for many years. While no protease domain per se is
constitutively part of the CA125 molecule, there is a high likelihood of a direct association by an
extracellular protease with the ligand binding domains of the CA125 molecule. Finally, what is the
role of the dominant repeat domain of this extracellular structure? Based on the expression data of
CA125 on epithelial surfaces and in glandular ducts, it is reasonable to conclude that the unique
structure of these repeat units with their cysteine loops plays a role both as glandular anti-invasive
molecules (bacterial entrapment) and/or a role in anti-adhesion (maintaining patency) between
epithelial surfaces and in ductal linings.

Recently, Yin and Lloyd described the partial cloning of the CA125 antigen using a
completely different approach to that described in the present invention [Yin BWT et al., Molecular
cloning of the CA125 ovarian cancer antigen. Identification as a new mucin (MUC16), J. Biol. Chem.
276:27371-27375 (2001)]. Utilizing a polyclonal antibody to CA125 to screen an expression library
of the ovarian tumor cell line OVCAR-3, these researchers identified a 5965 bp clone containing a
stop codon and a poly-A tail, which included nine partially conserved tandem repeats followed by a
potential transmembrane region with a cytoplasmic tail. The 5965 bp sequence is almost completely
homologous to the carboxy terminus region shown in Table 21, differing in only a few bases.
As mentioned above, the cytoplasmic tail has the potential for
phosphorylation and a transmembrane domain would anchor this part of the CA125 molecule to the
surface of the epithelial or tumor cell. In the extracellular matrix, a relatively short transition domain
connects the transmembrane anchor to a series of tandem repeats (nine, in the case of Yin and Lloyd).
By contrast, the major extracellular part of the molecule of the present invention as shown is
upstream from the sequence described by Yin and includes a large series of tandem repeats. These
results, of course, provide a different picture of the CA125 molecule, one which suggests that CA125 is
dominated by the series of extracellular repeats. Also included is a major amino terminal domain
(~1638 amino acids) for the CA125 molecule, which it is believed accounts for a great deal of the
O-glycosylation known to be an important structural component of CA125.

In conclusion, a CA125 molecule is disclosed which requires a transcript of more than 35,000
bases and occupies approximately 150,000 bp on chromosome 19q13.2. It is dominated by a large
series of extracellular repeat units (156 amino acids), which offer the potential for molecular
interactions, especially through a highly conserved unique cysteine loop. The repeat units also
include the epitopes now well described and classified for both major classes of CA125 antibodies
(i.e., the OC125 and the M11 groups). The CA125 molecule is anchored at its carboxy terminal
through a transmembrane domain and a short cytoplasmic tail. CA125 also contains a highly
glycosylated amino terminal domain, which includes a large extracellular exon typical of some
mucins. Given the massive repeat domain presence on both epithelial surfaces and ovarian tumor cell
surfaces, it might be anticipated that CA125 may play a major role in determining the extracellular
environment surrounding epithelial and tumor cells.
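The relationship between the stated domain sizes and the transcript length can be sketched as back-of-the-envelope arithmetic. The calculation below is a rough consistency check only. It uses the domain lengths given in the text (1,637 residues for Domain 1, 156 residues per repeat unit, 284 residues for Domain 3), treats 60 repeats as a lower bound because the text states "more than 60", and ignores untranslated regions and the stop codon. The function names are hypothetical.

```python
DOMAIN1_AA = 1637      # amino terminal domain (exon ranges #1-1637)
REPEAT_UNIT_AA = 156   # length of one repeat unit
DOMAIN3_AA = 284       # carboxy terminal sequence after the repeats
MIN_REPEATS = 60       # text states "more than 60"; used as a floor


def coding_nucleotides(n_repeats):
    """Coding nucleotides for a core protein with n_repeats repeat units."""
    total_aa = DOMAIN1_AA + n_repeats * REPEAT_UNIT_AA + DOMAIN3_AA
    return 3 * total_aa


def min_repeats_for(target_nt):
    """Smallest repeat count whose coding length exceeds target_nt."""
    n = 0
    while coding_nucleotides(n) <= target_nt:
        n += 1
    return n


print(coding_nucleotides(MIN_REPEATS))  # coding length at exactly 60 repeats
print(min_repeats_for(35_000))          # repeats needed for >35,000 coding nt
```

With these figures, 60 repeats give roughly 33,800 coding nucleotides, and the coding sequence alone passes 35,000 nucleotides at 63 repeats. Since untranslated regions add further length, a transcript of more than 35,000 bases appears consistent with a repeat count somewhat above 60.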
Advantages and Uses of the CA125 Recombinant Products 

1) Current assays to CA125 utilize as standards either CA125 produced from cultured cell 
lines or from patient ascites fluid. Neither source is defined with regard to the quality or purity of 
the CA125 molecule. Therefore arbitrary units are used to describe patient levels of CA125. 
Because cut-off values are important in the treatment of patients with elevated CA125 and 
because many different assay systems are used clinically to measure CA125, it is relevant and 
indeed necessary to define a standard for all CA125 assays. Recombinant CA125 containing 
epitope binding sites could fulfill this need for standardization. Furthermore, new and more 
specific assays may be developed utilizing recombinant products for antibody production. 

2) Vaccines: Adequate data now exist [see Wagner U et al., Immunological
consolidation of ovarian carcinoma recurrences with monoclonal anti-idiotype antibody
ACA125: Immune responses and survival in palliative treatment, Clin. Cancer Res. 7:1112-1115
(2001)] which suggest and support the idea that CA125 could be used as a therapeutic vaccine to
treat patients with ovarian carcinoma. Heretofore, in order to induce cellular and humoral
immunity in humans to CA125, murine antibodies specific for CA125 were utilized in
anticipation of patient production of anti-idiotypic antibodies, thus indirectly allowing the
induction of an immune response to the CA125 molecule. With the availability of recombinant
CA125, especially domains which encompass epitope binding sites for known murine antibodies
and domains directly anchoring CA125 on the tumor cell, it will be feasible to more directly
stimulate patients' immune systems to CA125 and, as a result, extend the life of ovarian
carcinoma patients as demonstrated by Wagner et al.

Several approaches can be utilized to achieve such a therapeutic response in the immune
system: 1) directly immunizing the patient with recombinant antigen containing the CA125
epitopes or other domains; 2) harvesting dendritic cells from the patient; 3) expanding these cells
in in vitro culture; 4) activating the dendritic cells with the recombinant CA125 epitope domain
or other domains or with peptides derived from these domains [see Santin AD et al., Induction of
ovarian tumor-specific CD8+ cytotoxic T lymphocytes by acid-eluted peptide-pulsed autologous
dendritic cells, Obstetrics & Gynecology 96(3):422-430 (2000)]; and then 5) returning these
immune stem cells to the patient to achieve an immune response to CA125. This procedure can
also be accomplished using specific peptides which are compatible with histocompatibility
antigens of the patient. Such peptides compatible with the HLA-A2 binding motifs common in
the population are indicated in Figure 12.
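A motif-based peptide search of this kind can be sketched as follows. The method used to select the peptides of Figure 12 is not described in this section, so the code below is only an illustration under a simplifying assumption: it applies the commonly cited HLA-A2 anchor preferences for nonamers (leucine or methionine at position 2, valine or leucine at position 9) to residues 1-50 of repeat clone 12 from Table 3. The function name and default anchor sets are hypothetical, and a motif match does not by itself establish binding or immunogenicity.

```python
def hla_a2_candidates(protein, p2_anchors="LM", p9_anchors="VL"):
    """Return (1-based start, peptide) for each 9-mer matching the anchors.

    Only the two primary anchor positions are checked; secondary anchors
    and measured binding affinities are deliberately out of scope.
    """
    hits = []
    for i in range(len(protein) - 8):
        peptide = protein[i:i + 9]
        if peptide[1] in p2_anchors and peptide[8] in p9_anchors:
            hits.append((i + 1, peptide))
    return hits


# Residues 1-50 of repeat clone 12 as listed in Table 3.
REPEAT_FRAGMENT = "ERVLQGLLRSLFKSTSVGPLYSGCRLTLLRPEKDGTATGVDAICTHHPDP"

for start, peptide in hla_a2_candidates(REPEAT_FRAGMENT):
    print(start, peptide)
```

Candidates found this way would still need to be confirmed experimentally, for example by binding assays or T-cell stimulation of the kind cited above.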

3) Therapeutic Targets: Molecules which are expressed on the surface of tumor cells, as
CA125 is, offer potential targets for immune stimulation, drug delivery, biological modifier
delivery or any agent which can be specifically delivered to ultimately kill the tumor cells.
CA125 offers such potential as a target: 1) Antibodies to CA125 epitopes or newly described
potential epitopes: most especially, humanized or human antibodies to CA125 could
directly activate the patient's immune system to attack and kill tumor cells. Antibodies could also be
used to deliver drugs or toxic agents, including radioactive agents, to mediate direct killing of
tumor cells. 2) Natural ligands: Under normal circumstances, molecules are bound to the CA125
molecule, e.g., a 50 kDa protein which does not contain CA125 epitopes co-purifies with
CA125. Such a molecule, which might have a natural binding affinity for domains on the CA125
molecule, could also be utilized to deliver therapeutic agents to tumor cells.

4) Anti-sense therapy: CA125 expression may provide a survival or metastatic advantage
to ovarian tumor cells. As such, antisense oligonucleotides derived from the CA125 sequence could
be used to down-regulate the expression of CA125. Antisense therapy could be used in
association with a tumor cell delivery system such as described above.

5) Small Molecules: Recombinant domains of CA125 also offer the potential to identify 
small molecules which bind to individual domains of the molecule. Small molecules either from 
combinatorial chemical libraries or small peptides can also be used as delivery agents or as
biological modifiers.

All references referred to herein are hereby incorporated by reference in their entirety. 
It should be understood that various changes and modifications to the presently preferred 
embodiments described herein will be apparent to those skilled in the art. Such changes and 
modifications can be made without departing from the spirit and scope of the present invention
and without diminishing its attendant advantages.

TABLE 1



Comparison of the Amino Acid Terminal Sequences and Several Internal Sequences 
for the 40kD Band for CA125 glycoprotein (SEQ ID NO: 1 through SEQ ID NO: 4) to 
the Nucleotide and Amino Acid Sequences for EST Genbank Accession No. AA640762 

(SEQ ID NO: 5 and SEQ ID NO: 6, respectively) 



40kDa Nterm - |QHPGSRKFKTTEG| (SEQ ID NO: 1) 



Peak 68 -£J^ggyU^L (SEQ ID NO: 2) 

Peak 65 - ^^GP]^ (SEQ ID NO: 3) 

Peak 30 - DGAANGVD (SEQ ID NO: 4) 



(SEQ ID NO: 5 and SEQ ID NO: 6) 



  1 CGTCGACCTGGCTCTAGAAAGTTTAACACCACGGAGAGAGTCCTTCAGGGTCTGCTCAGG
     R  R  P  G  S  R  K  F  N  T  T  E  R  V  L  Q  G  L  L  R

 61 CCTGTGTTCAAGAACACCAGTGTTGGCCCTCTGTACTCTGGCTGCAGACTGACCTTGCTC
     P  V  F  K  N  T  S  V  G  P  L  Y  S  G  C  R  L  T  L  L

121 AGGCCCAAGAAGGATGGGGCAGCCACCAAAGTGGATGCCATCTGCACCTACCGCCCTGAT
     R  P  K  K  D  G  A  A  T  K  V  D  A  I  C  T  Y  R  P  D

181 CCCAAAAGCCCTGGACTGGACAGAGAGCAGCTATACTGGGAGCTGAGCCAGGGTGATGCA
     P  K  S  P  G  L  D  R  E  Q  L  Y  W  E  L  S  Q  G  D  A

TABLE 2A



Nucleotide and Amino Acid Sequences for Sense Primer 5'→3' (SEQ ID NO: 7 and
SEQ ID NO: 8, respectively) and Antisense Primer 5'→3'
(SEQ ID NO: 9 and SEQ ID NO: 10, respectively) Based upon Regions of Homology for
EST Genbank Accession Nos. BE005912 and AA640762

GGA GAG GGT TCT GCA GGG TC (SEQ ID NO: 7) 

E R V L Q G (SEQ ID NO: 8) 

GTG AAT GGT ATC AGG AGA GG (SEQ ID NO: 9) 

P L L I P F (SEQ ID NO: 10) 
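The correspondence between the primers and the peptide sequences in Table 2A can be verified with the standard genetic code. The sketch below is illustrative only: the helper names are hypothetical, and the reading frame offset for the sense primer is an assumption chosen so that the translation matches SEQ ID NO: 8. The antisense primer is read on its reverse complement.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove whitespace and upper-case a nucleotide string."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def translate(seq, offset=0):
    """Translate from the given offset, ignoring any trailing partial codon."""
    seq = clean(seq)[offset:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


SENSE = "GGA GAG GGT TCT GCA GGG TC"      # SEQ ID NO: 7
ANTISENSE = "GTG AAT GGT ATC AGG AGA GG"  # SEQ ID NO: 9

print(translate(SENSE, offset=1))                # compare with SEQ ID NO: 8
print(translate(reverse_complement(ANTISENSE)))  # compare with SEQ ID NO: 10
```

The same `translate` helper can be applied to the EST nucleotide sequence of Table 1 to reproduce its amino acid translation.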



TABLE 2B 



Sense and Anti-Sense Primers Used for Ordering Repeat Units 
(SEQ ID NO: 301 and SEQ ID NO: 302, respectively) 



5'-GTCTCTATGTCAATGGTTTCACCC-3' (SEQ ID NO: 301) 

5'-TAGCTGCTCTCTGTCCAGTCC-3' (SEQ ID NO: 302)

TABLE 3



Amino Acid Sequence for a 400 bp Repeat in the CA125 Molecule
(SEQ ID NO: 11 thru SEQ ID NO: 21)

      1                                                   50
12    ERVLQGLLRS LFKSTSVGPL YSGCRLTLLR PEKDGTATGV DAICTHHPDP
34    ERVLQGLLMP LFKNTSVSSL YSGCRLTLLR PEKDGAATRA DAVCTHRPDP
32    ERVLQGLLGP IFKNTSVGPL YSGCRLTSLR SEKDGAATGV DAICIHRLDP
46    ERVLQGLLGP MFKNTSVGLL YSGCRLTLLR PEKNGAATGM DAICSHRLDP
33    ERVLQGLLGP LFKNSSVGPL YSGCRLISLR SEKDGAATGV DAICTHHLNP
15    ERVLQGLLRP LFKSTSAGPL YSGCRLTLLR PEKHGAATGV DAICTLRLDP
35    ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP
111   ERVLQGLLTP LFKNTSVGPL YSGCRLTLLR PEKQEAATGV DTICTHRVDP
42    ERVLQGLLKP LFKNTSVGPL YSGCRLTLLR PEKHEAATGV DTICTHRLDP
116   ERVLQGLLSP IFKNSSVGPL YSGCRLTSLR PEKDGAATGM DAVCLYHPNP
23    ERVLQGLLRP LFKNTSIGPL YSSCRLTLLR PEKDKAATRV DAICTHHPDP

      51                                                 100
12    KSPRLDREQL YWELSQLTHN ITELGPYALD NDSLFVNGFT HRSSVSTTST
34    KSPGLDRERL YWKLSQLTHG ITELGPYTLD RHSLYVNGFT HQSSMTTTRT
32    KSPGLNREQL YWELSKLTND IEELGPYTLD RNSLYVNGFT HQSSVSTTST
46    KSPGLNREQL YWELSQLTHG IKELGPYTLD RNSLYVNGFT HRSSVAPTST
33    QSPGLDREQL YWQLSQMTNG IKELGPYTLD RNSLYVNGFT HRSSGLTTST
15    TGPGLDRERL YWELSQLTNS VTELGPYTLD RDSLYVNGFT HRSSVPTTSI
35    LNPGLDREQL YWELSKLTRG IIELGPYTLD RDSLYVNGFT HRSSVPTTSI
111   IGPGLDRERL YWELSQLTNS ITELGPYTLD RDSLYVDGFN PWSSVPTTST
42    LNPGLDREQL YWELSKLTRG IIELGPYLLD RGSLYVNGFT HRNFVPITST
116   KRPGLDREQL YWELSQLTHN ITELGPYSLD RDSLYVNGFT HQNSVPTTST
23    QSPGLNREQL YWELSQLTHG ITELGPYTLD RDSLYVDGFT HWSPIPTTST

      101                                                150
12    PGTPTVYLGA SKTPASIFGP S..AASPLLI PFT    (SEQ ID NO: 11)
34    PDTSTMHLAT SRTPASLSGP T..TASPLLI        (SEQ ID NO: 12)
32    PGTSTVDLRT SGTPSSLSSP TIMAAGPLLI        (SEQ ID NO: 13)
46    PGTSTVDLGT SGTPSSLPSP T..TAVPLLI        (SEQ ID NO: 14)
33    PWTSTVDLGT SGTPSPVPSP T..TAGPFLI        (SEQ ID NO: 15)
15    PGTSAVHLET SGTPASLPGH T..APGPLLI        (SEQ ID NO: 16)
35    PGTSAVHLET SGTPASLPGH I..VPGPLLI        (SEQ ID NO: 17)
111   PGTSTVHLAT SGTPSPLPGH T..APVPLLI PFT-   (SEQ ID NO: 18)
42    PGTSTVHLGT SETPSSLPRP I..VPGPLLV PFT-   (SEQ ID NO: 19)
116   PGTSTVYWAT TGTPSSFPGH T..EPGPLLI        (SEQ ID NO: 20)
23    PGTSIVNLGT SGIPPSLPET T..ATGPLLI PFT-   (SEQ ID NO: 21)

TABLE 3 -continued

Amino Acid Sequence for a 400 bp Repeat in the CA125 Molecule
(SEQ ID NO: 11 thru SEQ ID NO: 21)

      151               170
12
34
32
46
33
15
35
111
42
116
23

TABLE 4



Amino Acid Sequence for an 800 bp Repeat in the CA125 Molecule
(SEQ ID NO: 22 thru SEQ ID NO: 35)

      1                                                   50
79    ERVLQGLLKP LFRNSSLEYL YSGCRLASLR PEKDSSAMAV DAICTHRPDP  (SEQ ID NO: 22)
811   ERVLQGLLKP LFRNSSLEYL YSGCRLASLR PEKDSSAMAV DAICTHRPDP  (SEQ ID NO: 23)
21    ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP  (SEQ ID NO: 24)
89    ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP  (SEQ ID NO: 25)
85    ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP  (SEQ ID NO: 26)
712   ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP  (SEQ ID NO: 27)
86    ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKHGAATGV DAICTLRLDP  (SEQ ID NO: 28)
87    ERVLQGLLTP LFKNTSVGPL YSGCRLTLLR PEKQEAATGV DTICTHRVDP  (SEQ ID NO: 29)
810   ERVLQGLLKP LFKNTSIGPL YSSCRLTLLR PEKDKAATRV DAICTHHPDP  (SEQ ID NO: 30)
83    ERVLQGLLRP VFKNTSVGPL YSGCRLTLLR PKKDGAATKV DAICTYRPDP  (SEQ ID NO: 31)
81    ERVLQGLLGP MFKNTSVGLL YSGCRLTLLR PKKDGAATKV DAICTYRPDP  (SEQ ID NO: 32)
44    ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKDGAATGM DAVCLYHPNP  (SEQ ID NO: 33)
812   ERVLQGLLSP ISKNSSVGPL YSGCRLTSLR PEKDGAATGM DAVCLYHPNP  (SEQ ID NO: 34)
76    ERVLQGLLSP IFKNSSVGSL YSGCRLTLLR PEKDGAATRV DAVCTHRPDP  (SEQ ID NO: 35)

      51                                                 100
79    EDLGLDRERL YWELSNLTNG IQELGPYTLD RNSLYVNGFT HRSSMPTTST
811   EDLGLDRERL YWELSNLTNG IQELGPYTLD RNSLYVNGFT HRSSGLTTST
21    LNPGLDREQL YWELSKLTRG IIELGPYLLD RGSLYVNGFT HRTSVPTTST
89    LNPGLDREQL YWELSKLTRG IIELGPYLLD RGSLYVNGFT HRNFVPITST
85    LNPGLDREQL YWELSKLTRG IIELGPYLLD RGSLYVNGFS RQSSMTTTRT
712   LNPGLDREQL YWELSKLTRG IIELGPYLLD RDSLYVNGFT HRSSVPTTSI
86    TGPGLDRERL YWELSQLTNS VTELGPYTLD RDSLYVNGFT HRSSVPTTSI
87    IGPGLDRERL YWELSQLTNS ITELGPYTLD RDSLYVNGFN PWSSVPTTST
810   QSPGLNREQL YWELSQLTHG ITELGPYTLD RDSLYVDGFT HWSPIPTTST
83    KSPGLDREQL YWELSQLTHS ITELGPYTLD RDSLYVNGFT QRSSVPTTSI
81    KSPGLDREQL YWELSQLTHS ITELGPYTLD RDSLYVNGFT QRSSVPTTSI
44    KRPGLDREQL YCELSQLTHD ITELGPYSLD RDSLYVNGFT HQNSVPTTST
812   KRPGLDREQL YWELSQLTHN ITELGPYSLD RDSLYVNGFT HQNSVPTTST
76    KSPGLDRERL YWKLSQLTHG ITELGPYTLD RHSLYVNGFT HQSSMTTTRT

      101                                                150
79    PGTSTVDVGT SGTPSSSPSP TTAGPLLMPF TLNFTITNLQ YEEDMRRTGS
811   PWTSTVDLGT SGTPSPVPSP TTAGPLLIPF TLNFTITNLQ YEENMGHPGS
21    PGTSTVDLGT SGTPFSLPSP ATAGPLLVLF TLNFTITNLK YEEDMHRPGS
89    PGTSTVHLGT SETPSSLPRP IVPGPLLIPF TINFTITNLR YEENMHHPGS
85    PDTSTMHLAT SRTPASLSGP TTASPLLIPF TLNFTITNLQ YEENMGHPGS
712   PGTSAVHLET FGTPASLHGH TAPGPVLVPF TLNFTITNLQ YEEDMRHPGS
86    PGTSAVHLET SGTPASLPGH TAPGPLLVPF TLNFTITNLQ YEEDMRHPGS
87    PGTSTVHLAT SGTPSSLPGH TAPVPLLIPF TLNFTITNLK YEENMQHPGS
810   PGTSIVNLGT SGIPPSLPET TATGPLLIPF TPNFTITNLQ YEEDMRRTGS
83    PGTPTVDLGT SGTPVSKPGP SAASPLLVPF TLNFTITNLQ YEEDMHRPGS
81    PGTPTVDLGT SGTPVSKPGP SAASPLLIPF TINFTITNLR YEENMGHPGS
44    PGTSTVYWAT TGTPSSFPGH TEPGPLLIPF TFNFTITNLH YEENMQHPGS
812   PGTSTVYWAT TGTPSSFPGH TEPGPLLIPF TVNFTITNLR YEENMHHPGS
76    PDTSTMHLAT SRTPASLSGP TTASPLLVLF TINFTITNQR YEENMHHPGS

TABLE 4 -continued



Amino Acid Sequence for a 800 bp Repeat in the CA125 Molecule 
(SEQ ID NO: 22 thru SEQ ID NO: 35) 



151 200 

79 RKFNTMERVL QGLLSPIFKN SSVGPLYSGC RLTSLRPEKD GAATGMDAVC 

811 RKFNIMERVL QGLLMPLFKN TSVSSLYSGC RLTLLRPEKD GAATRVDAVC 
21 RKFNTTERVL QTLLGPMFKN TSVGLLYSGC RLTLLRSEKD GAATGVDAIC 

89 RKFNIMERVL QGLLGPLFKN SSVGPLYSGC RLISLRSEKD GAATGVDAIC

85 RKFNIMERVL QGLLNPIFKN SSVGPLYSGC RLTSLKPEKD GAATGMDAVC 
712 RKFNTTERVL QGLLKPLFKS TSVGPLYSGC RLTLLRPEKR GAATGVDTIC 

86 RKFNTTERVL QGLLKPLFKS TSVGPLYSGC RLTLLRPEKR GAATGVDTIC 

87 RKFNTTERVL QGLLKPLFKS TSVGPLYSGC RLTLLRPEKH GAATGVDAIC 

810 RKFNTMERVL QGLLSPIFKM SSVGPLYSGC RLTSLRPEKD GAATGMDAVC 
83 RKFNATERVL QGLLSPIFKN SSVGPLYSGC RLTSLRPEKD GAATGMDAVC 
81 RKFNIMERVL QGLLKPLFKN TSVGPLYSGC RLTLLRPKKD GAATGVDAIC 
44 RKFNTTERVL QGLLKPLFKN TSVGPLYSGC RLTLLRPEKH EAATGVDTIC 

812 RKFNTTERVL QGLLRPVFKN TSVGPLYSGC RLTLLRPKKD GAATKVDAIC 
76 RKFNTTERVL QGLLRPVFKN TSVGPLYSGC RLTLLRPKKD GAATKVDAIC 

201 250 

79 LYHPNPKRPG LDREQLYWEL SQLTHNITEL GPYSLDRDSL YVNGFTHQNS

811 TQRPDPKSPG LDRERLYWKL SQLTHGITEL GPYTLDRHSL YVNGLTHQSS 
21 THRLDPKSPG VDREQLYWEL SQLTNGIKEL GPYTLDRNSL YVNGFTHWIP 

89 THHLNPQSPG LDREQLYWQL SQMTNGIKEL GPYTLDRNSL YVNGFTHRSS

85 LYHPNPKRPG LDREQLYWEL SQLTHGIKEL GPYTLDRNSL YVNGFTHRSS 
712 THRLDPLNPG LDREQLYWEL SKLTRGIIEL GPYLLDRGSL YVNGFTHRNF 

86 THRLDPLNPG LDREQLYWEL SKLTRGIIEL GPYLLDRGSL YVNGFTHRNF 

87 THRLDPKSPG VDREQLYWEL SQLTNGIKEL GPYTLDRNSL YVNGFTHWIP 

810 LYHPNPKRPG LDREQLY 

83 LYHPNPKRPG LDREQLYWEL SQLTHNITEL GPYSLDRDSL YVNGFTHQSS 

81 THRLDPKSPG LNREQLYWEL SKLTNDIEEL GPYTLDRNSL YVNGFTHQSS 

44 THRVDPIGPG LDRERLYWEL SQLTNSIHEL GPYTLDRDSL YVNGFNPRSS 

812 TYRPDPKSPG LDREQLYWEL SKLTNDIEEL GPYTLDRNSL YVNGFTHQSS 
76 TYRPDPKSPG LDREQLYWEL SQLTHSITEL GPYTQDRDSL YVNGFTHRSS 

251 288 

79 VPTTSTPGTS TVYWATTGTP SSFPGHT..E PGPL
811 MTTTRTPDTS TMHLATSRTP ASLSGPT..T ASPLLIPF
21
89 GLTTSTPWTS TVDLGTSGTP SPVPSPT..T AGPLLIPF
85 VAPTSTPGTS TVDLGTSGTP SSLPSPT..T AVPLLIPF
712 VPITSTPGTS TVHLGTSETP SSLPRPI..V PGPLLIPF
86 VPITSTPGTS TVHLGTSETP SSLPRPI..V PGPLLIPF
87 VPTSSTPGTS TVDLG.SGTP SSLPSPT..T AGPL
810
83 MTTTRTPDTS TMHLATSRTP ASLSGPT..T ASPLLIPF
81 VSTTSTPGTS TVDLRTSGTP SSLSSPTIMA AGPLLIPF
44 VPTTSTPGTS TVHLATSGTP SSLPGHT..A PVPLLI

812 VSTTSTPGTS TVDLRTSGTP SSLSSPTIMA AGPLLIPF 
76 VPTTSTPGTS AVHLETSGTP ASLP 
TABLE 5



Amino Acid Sequence for a 1200 bp Repeat in the CA125 Molecule 
(SEQ ID NO: 36 thru SEQ ID NO: 46) 



1 50

910 ERVLQGLLGP MFKNTSVGLL YSGCRLTLLR PEKRGAATGV DTICTHRLDP (SEQ ID NO: 36)
99 ERVLHGLLTP LFKNTRVGPL YSGCRLTLLR PEKQEAATGV DTICTHRVDP (SEQ ID NO: 37)
112                   GPL YSGCRLTSLR PEKDGAATGM DAVCLYHPNP (SEQ ID NO: 38)
95 ERVLQGPLSP IFKNSSVGPL YSGCRLTSLR PEKDGAATGM DAVCLYHPNP (SEQ ID NO: 39)
71                TSVGPL YSGCRLTLLR SEKDGAATGV DAIYTHRLDP (SEQ ID NO: 40)
78                             TLLR PKKDGVATGV DAICTHRLDP (SEQ ID NO: 41)
115 ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKDGVATRV DAICTHRPDP (SEQ ID NO: 42)
91 ERVLQGLLKP LFRNSSLEYL YSGCRLASLR PEKDSSAMAV DAICTHRPDP (SEQ ID NO: 43)
92 ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP (SEQ ID NO: 44)
113 ERVLQGLLGP MFKNTSVGLL YSGCRLTLLR PEKNGAATGM DAICSHRLDP (SEQ ID NO: 45)
711 ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKHGAATGV DAICTLRLDP (SEQ ID NO: 46)
100

910 LNPGLDREQL YWELSKLTRG TTELGPYLLD RGSLYVNGFT HRNFVPITST
99 IGPGLDRERL YWELSQLTNS TTELGPYTLD RDSLYVNGFN PWSSVPTTST
112 KRPGLDREQL YWELSQLTHM TTELGPYSLD RDSLYVNGFT HQNSVPTTST
95 KRPGLDREQL YWELSQLTHM TTELGPYSLD RDSLYVNGFT HQNSVPTTST
71 KSPGVDREQL YWELSQLTNG IKELGPYTLD RNSLYVNGFT HQTSAPNTST
78 KSPGLMREQL YWELSKLTND IEELGPYTLD RNSLYVNGFT HQSSVSTTST
115 KIPGLDRQQL YWELSQLTHS TTELGPYTLD RDSLYVNGFT QRSSVPTTST
91 EDLGLDRERL YWELSNLTNG TQELGPYTLD RNSLYVNGFT HRSSMPTTST
92 LNPGLDREQL YWELSKLTRG TTELGPYLLD RGSLYVNGFT HRNFVPITST
113 KSPGLNREQL YWELSQLTHG TKELGPYTLD RNSLYVNGFT HRSSVAPTST
711 TGPGLDRERL YWELSQLTNS VTELGPYTLD RDSLYVNGFT HRSSVPTTST
150

910 PGTSTVHLGT SETPSSLPRP IV..PGPLLV PFTLNFTITN LQYEEAMRHP
99 PGTSTVHLAT SGTPSSLPGH TA..PVPLLI PFTLNFTTTN LHYEENMQHP
112 PGTSTVYWAT TGTPSSFPGH T..EPGPLLI PFTLNFTITN LQYEENMGHP
95 PGTSTVYWAT TGTPSSFPGH T..EPGPLLI PFTLNFTITN LQYEENMGHP
71 PGTSTVDLGT SGTPSSLPSP T..SAGPLLI PFTTNFTITN LRYEENMHHP
78 PGTSTVDLRT SGTPSSLSSP TIMAAGPLLI PFTTNFTITN LRYEENMHHP
115 PGTFTVQPET SETPSSLPGP T..ATGPVLL PFTLNFTITN LQYEEDMHRP
91 PGTSTVDVGT SGTPSSSPSP T..TAGPLLM PFTLNFTITN LQYEEDMRRT
92 PGTSTVHLGT SETPSSLPRP TV..PGPLLI PFTLNFTITN LQYEENMGHP
113 PGTSTVDLGT SGTPSSLPSP T..TAVPLLT PFTLNFTITN LKYEEDMHCP
711 PGTSAVHLET SGTPASLPGH T..APGPLLT PFTLNFTITN LHYEENMQHP
200

910 GSRKFNTTER VLQGLLRPLF KNTSVSSLYS GCRLTLLRPE KDGAATRVDA
99 GSRKFNTTER VLQGLLKPLF KNTSVGPLYS GCRLTLFKPE KHEAATGVDA
112 GSRKFNITES VLQGLLTPLF KNSSVGPLYS GCRLISLRSE KDGAATGVDA
95 GSRKFNTTER VLQGLLNPTF KNSSVGPLYS GCRLTSLRPE KDGAATGMDA
71 GSRKFNTMER VLQGLLKPLF KSTSVGPLYS GCRLTLLRPE KDGVATRVDA
78 GSRKFNTMER VLQGLLMPLF KNTSVSSLYS GCRLTLLRPE KDGAATRVDA
115 GSRKFNTTER VLQGLLMPLF KNTSVGPLYS GCRLTLLRPE KQEAATGVDT
91 GSRKFNTMES VLQGLLKPLF KNTSVGPLYS GCRLTLLRPK KDGAATGVDA
92 GSRKFNTTER VLQGLLKPLF RNSSLEYLYS GCRLTSLRPE KDSSTMAVDA

113 GSRKFNTTER VLQSLFGPMF KNTSVGPLYS GCRLTLFRSE KDGAATGVDA 

711 GSRKFNTMER VLQGCLVPCS RNTNVGLLYS GCRLTLLXXX XXXXXXXXXX 

201 250 

910 ACTYRPDPKS PGLDREQLYW ELSQLTHSIT ELGPYTLDRV SLYVNGFNPR 

99 ICTLRLDPTG PGLDRERLYW ELSQLTNSVT ELGPYTLDRD SLYVNGFTHR 

112 ICTHHLMPQS PGLDREQLYW QLSQMTNGIK ELGPYTLDRD SLYVNGFTHR 
95 VCLYHPNPKR PGLDREQLYC ELSQLTHNIT ELGPYSLDRD SLYVNGFTHQ 
71 ICTHRPDPKI PGLDRQQLYW ELSQLTHSIT ELGPYTLDRD SLYVNGFTQR 
78 VCTHRPDPKS PGLDRERLYW KLSQLTHGIT ELGPYTLDRN SLYVNGFTHR

115 ICTHRLDPSE PGLDREQLYW ELSQLTNSIT ELGPYTLDRD SLYVNGFTHS 

91 ICTHRLDPKS PGLNREQLYW ELSKLTNDIE EVGPYTLDRN SLYVNGFTHR 

92 ICTHRPDPED LGLDRERLYW ELSNLTNGIQ ELGPYTLDRN SLYVNGFTHR 

113 ICTHRLDPKS PGVDREQLYW ELSQLTNGIK ELGPYTLDRN SLYVNGFTHQ 
711 XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXGPYTLDRN SLYVNGFTHR 

251 300 

910 SSV.PTTSTP GTSTVHLATS GTPSSLPGHT APVPLLIPFT LNFTITNLQY 

99 SSV.PTTSIP GTSAVHLETS GTPASLPGHT APGPLLIPFT LNFTITNLQY

112 SL.GLTTSTP WTSTVDLGTS GTPSPVPSPT TAGPLLIPFT LNFTITNLQY 
95 NS.VPTTSTP GTSTVYWATT GTPSSFPGHT EPGPLLIPFT LNFTITNLQY 
71 SSV.PTTSTP GTFTVQPETS ETPSSLPGPT ATGPVLLPFT LNFTITNLQY 
78 SSM.PTTSTP GTSTVDVGTS GTPSSSPSPT TAGPLLMPFT LNFTITNLQY 

115 GVLCPPPSIL GIFTVQPETF ETPSSLPGPT ATGPVLLPFT LNFTITNLQY 

91 SFVAP.TSTL GTSTVDLGTS GTPSSLPSPT TGVPLLIPFT LNFTITNLQY 

92 SFM.PTTSTL GTSTVDVGTS GTPSSSPSPT TAGPLLMPFT LNFTITNLQY 

113 TS.APNTSTP GTSTVDLGTS GTPSSLPSPT SAGPLLVPFT LNFTITNLQY 
711 SSVAP.TSTP GTSTVDLGTS GTPSSLPSPT TV.PLLVPFT LNFTITNLQY 

301 350 

910 EEDMRHPGSR KFNTMERVLQ GLLRPLFKNT SIGPLYSSCR LTLLRPEKDK 

99 EEDMRRTGSR KFNTMERVLQ GLLKPLFKST SVGPLYSGCR LTLLRPEKRG 

112 EENMGHPGSR KFNIMERVLQ GLLRPVFKNT SVGPLYSGCR LTLLRPKKDG 
95 EEDMRRTGSR KFNTMERVLQ GLLKPLFKST SVGPLYSGCR LTLLRPEKHG 
71 EEDMHRPGSR KFNTTERVLQ GLLKPLFKST SVGPLYSGCR LTLLRPEKHG 
78 EEDMRRTGSR KFNTMERVLQ GLLKPLFKST SVGPLYSGCR LTLLRPEKHG

115 EEDMHRPGSR KFNTTERVLQ GLLMPLFKNT SVGPLYSGCR LTLLRPEKQE 

91 EENMGHPGSR KFNIMERVLQ GLLMPLFKNT SVSSLYSGCR LTLLRPEKDG 

92 EEDMRRTGSR KFNTMESVLQ GLLKPLFKNT SVGPLYSGCR LTLLRPKKDG 

113 EEDMRRTGSR KFNTMESVLQ GLLKPLFKNT SVGPLYSGCR LTLLRPEKDG 
711 GEDMRHPGSR KFNTTERVLQ GLLGPLFKNS SVGPLYSGCR LISLRSEKDG 

351 400 

910 AATRVDAICT HHPDPQSPGL NREQLYWELS QLTHGITEL- 

99 AATGVDTICT HRLDPLNPGL DREQLYWELS KLTRGIIELG PYLLDRGSLY 

112 AATKVDAICT YRPDPKSPGL DREQLYWELS QLTHSITELG PYTLDRDSLY 

95 AATGVDAICT LRLDPTGPGL DRERLYWELS QLTNSVTELG PYTLDRDSLY 

71 AATGVDAICT LRLDPTGPGL DRERLYWELS QLTNSITELG PYTLDRDSLY 

78 AATGVDAICT LRLDPTGPGL DRERLYWELS QLTNSVTELG PYTLDRDSLY
115 AATGVDTICT HRVDPIGPGL DRERLYWELS QLTNSITELG PYTLDRDSLY 

91 AATRWAVCT HRPDPKSPGL DRERLYWKLS QLTHGITELG PYTLDRHSLY 

92 AATGVDAICT HRLDPKSPGL NREQLYWELS KLTNDIEELG PYTLDRNSLY 

113 AATGVDAICT HRLDPKSPGL NREQLYWELS KL 

711 AATGVDAICT HHLNPQSPGL DREQLYWQLS QVTNGIKELG PYTLDRNSLY 

401 447 

910 

99 VNGFTHRNFV PITSTPGTST VHLGTSEIHP SLPRPI..VP GPL 

112 VNGFTQRSSV PTTSIPGTPT VDLGTSGTPV SKPGPS..AA SP 

95 VNGFTHRSSV PTTSIPGTSA VHLETSGTPA SLPGHT..AP GPLL 

71 VNGFNPWSSV PTTSTPGTST VHLATSGTPS SLPGHT..AP VPL

78 VNGFTHRSSV PTTSIPGTSA VHLETSGTPA SLPGHT..AP GPLLIPF

115 VNGFNPWSSV PTTSTPGTST VHLATSGTPS SLPGHT..AP VPLLIPF

91 VNGFTHQSSM TTTRTPDTST MHLATSRTPA SLSGPT..TA SPLLIPF 

92 VNGFTHQSSV STTSTPGTST VDPRTSGTPS SLSSPTIMAA GPLLI 

113 

711 VNGFTHRSSG LTTSTPWTST VDLGTSGTPS PVPSPT..TA GPLLI
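
The row-by-row alignment in Table 5 lends itself to a simple position-wise comparison. The following is a minimal sketch only, assuming that two rows of equal length are compared column by column; the helper name `count_matches` is hypothetical, and the two strings are positions 1 to 50 of rows 910 and 92 (SEQ ID NO: 36 and SEQ ID NO: 44) as printed above.

```python
def count_matches(row_a: str, row_b: str) -> int:
    """Count identical positions between two aligned rows (spaces ignored)."""
    a = row_a.replace(" ", "")
    b = row_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("rows must cover the same alignment columns")
    return sum(x == y for x, y in zip(a, b))


# Positions 1-50 of Table 5, rows 910 and 92, copied from the table.
row_910 = "ERVLQGLLGP MFKNTSVGLL YSGCRLTLLR PEKRGAATGV DTICTHRLDP"
row_92 = "ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP"

matches = count_matches(row_910, row_92)
print(f"{matches} of 50 aligned positions are identical")
```

Rows that contain gap characters (".") would need those columns excluded before counting; that refinement is omitted here.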
TABLE 6



Amino Acid Sequence for a 9 Repeat Structure in the CA125 Molecule 

(SEQ ID NO: 47) 



ERVLQGLLKP LFRNSSLEYL YSGCRLASLR PEKDSSAMAV DAICTHRPDP
EDLGLDRERL YWELSNLTNG IQELGPYTLD RNSLYVNGFT HRSSMPTTST 
PGTSTVDVGT SGTPSSSPSP TTAGPLLMPF TLNFTITNLQ YEEDMRRTGS 
RKFNTMERVL QGPLSPIFKN SSVGPLYSGC RLTSLRPEKD GAATGMDAV
CLYHPNPKRP GLDREQLYWE LSQLTHNITE LGPYSLDRDS LYVNGFTHQN
SVPTTSTPGT STVYWATTGT PSSFPGHTEP GPLLIPFTLN FTITNLQYEE
NMGHPGSRKF NITERVLQGL LNPIFKNSSV GPLYSGCRLT SLRPEKDGAA
TGMDAVCLYH PNPKRPGLDR EQLYCELSQL THNITELGPY SLDRDSLYVN 
GFTHQNSVPT TSTPGTSTVY WATTGTPSSF PGHTEPGPLL IPFTLNFTIT 
NLQYEEDMRR TGSRKFNTME RVLQGLLKPL FKSTSVGPLY SGCRLTLLRP 
EKHGAATGVD AICTLRLDPT GPGLDRERLY WELSQLTNSV TELGPYTLDR 
DSLYVNGFTH RSSVPTTSIP GTSAVHLETS GTPASLPGHT APGPLLVPFT 
LNFTITNLQY EEDMRHPGSR KFNTTERVLQ GLLKPLFKST SVGPLYSGCR 
LTLLRPEKRG AATGVDTICT HRLDPLNPGL DREQLYWELS KLTRGIIELG 
PYLLDRGSLY VNGFTHRNFV PITSTPGTST VHLGTSETPS SLPRPIVPGP 
LLIPFTLNFT ITNLQYEENM GHPGSRKFNI TERVLQGLLK PLFRNSSLEY 
LYSGCRLASL RPEKDSSAMA VDAICTHRPD PEDLGLDRER LYWELSNLTN
GIQELGPYTL DRNSLYVNGF THRSSMPTTS TPGTSTVDVG TSGTPSSSPS 
PTTAGPLLMP FTLNFTITNL QYEEDMRRTG SRKFNTMESV LQGLLKPLFK
NTSVGPLYSG CRLTLLRPKK DGAATGVDAI CTHRLDPKSP GLNREQLYWE
LSKLTNDIEE VGPYTLDRNS LYVNGFTHRS FVAPTSTLGT STVDLGTSGT 
PSSLPSPTTG VPLLIPFTLN FTITNLQYEE NMGHPGSRKF NIMERVLQGL 
LSPIFKNSSV GSLYSGCRLT LLRPEKDGAA TRVDAVCTHR PDPKSPGLDR 
ERLYWKLSQL THGIIELGPY TLDRHSFYVN GFTHQSSMTT TRTPDTSTMH 
LATSRTPASL SGPTTASPLL VLFTINFTIT NQRYEENMHH PGSRKFNTTE 
RVLQGLLRPV FKNTSVGPLY SGCRLTLLRP KKDGAATKVD AICTYRPDPK 
SPGLDREQLY WELSQLTHSI TELGPYTQDR DSLYVNGFTH RSSVPTTSIP 
GTSAVHLETS GTPASLP 
TABLE 7



cDNA Genbank Accession # AK024365 Encompasses Repeat Sequences (Repeats 1 & 2) 
Homologous to Two Repeats Shown in Table 6 
(SEQ ID NO: 48) 



MPLFKNTSVS SLYSGCRLTL LRPEKDGAAT RVDAVCTHRP DPKSPGLDRE 

RLYWKLSQLT HGIIELGPYT LDRHSFYVNG FTHQSSMTTT RTPDTSTMHL 

ATSRTPASLS GPTTASPLLV LFTINFTITN QRYEENMHHP GSRKFNTTER 

VLQGLLRPVF KNTSVGPLYS GCRLTLLRPK KDGAATKVDA ICTYRPDPKS 

PGLDREQLYW ELSQLTHSIT ELGPYTQDRD SLYVNGFTHR SSVPTTSIPG

TSAVHLETSG TPASLPGPSA ASPLLVLFTL NFTITNLRYE ENMQHPGSRK 

FNTTERVLQG LLRSLFKSTS VGPLYSGCRL TLLRPEKDGT ATGVDAICTH 

HPDPKSPRLD REQLYWELSQ LTHNITELGH YALDNDSLFV NGFTHRSSVS 

TTSTPGTPTV YLGASKTPAS IFGPSAASHL LILFTLNFTI TNLRYEENMW 

PGSRKFNTTE RVLQGLLRPL FKNTSVGPLY SGSRLTLLRP EKDGEATGVD

AICTHRPDPT GPGLDREQLY LELSQLTHSI TELGPYTLDR DSLYVNGFTH 

RSSVPTTSTG VVSEEPFTLN FTINNLRYMA DMGQPGSLKF NITDNVMKHL

LSPLFQRSSL GARYTGCRVI ALRSVKNGAE TRVDLLCTYL QPLSGPGLPI 

KQVFHELSQQ THGITRLGPY SLDKDSLYLN GYNEPGLDEP PTTPKPATTF 

LPPLSEATTA MGYHLKTLTL NFTISNLQYS PDMGKGSATF NSTEGVLQHL 

LRPLFQKSSM GPFYLGCQLI SLRPEKDGAA TGVDTTCTYH PDPVGPGLDI 

QQLYWELSQL THGVTQLGFY VLDRDSLFIN GYAPQNLSIR GEYQINFHIV 

NWNLSNPDPT SSEYITLLRD IQDKVTTLYK GSQLHDTFRF CLVTNLTMDS 

VLVTVKALFS SNLDPSLVEQ VFLDKTLNAS FHWLGSTYQL VDIHVTEMES 

SVYQPTSSSS TQHFYLNFTI TNLPYSQDKA QPGTTNYQRN KRNIEDALNQ 

LFRNSSIKSY FSDCQVSTFR SVPNRHHTGV DSLCNFSPLA RRVDRVAIYE 

EFLRMTRNGT QLQNFTLDRS SVLVDGYSPN RNEPLTGNSD LPFWAVILIG 

LAGLLGLITC LICGVLVTTR RRKKEGEYNV QQQCPGYYQS HLDLEDLQ 
TABLE 8



Complete DNA Sequence for 13 Repeats including the Carboxy Terminus of CA125 

(SEQ ID NO: 49) 

1 GAGAGGGTTC TGCAGGGTCT GCTCAAACCC TTGTTCAGGA ATAGCAGTCT 

51 GGAATACCTC TATTCAGGCT GCAGACTAGC CTCACTCAGG CCAGAGAAGG 

101 ATAGCTCAGC CATGGCAGTG GATGCCATCT GCACACATCG CCCTGACCCT 

151 GAAGACCTCG GACTGGACAG AGAGCGACTG TACTGGGAGC TGAGCAATCT 

201 GACAAATGGC ATCCAGGAGC TGGGCCCCTA CACCCTGGAC CGGAACAGTC
251 TCTATGTCAA TGGTTTCACC CATCGAAGCT CTATGCCCAC CACCAGCACT 

301 CCTGGGACCT CCACAGTGGA TGTGGGAACC TCAGGGACTC CATCCTCCAG
351 CCCCAGCCCC ACGACTGCTG GCCCTCTCCT GATGCCGTTC ACCCTCAACT 
401 TCACCATCAC CAACCTGCAG TACGAGGAGG ACATGCGTCG CACTGGCTCC 
451 AGGAAGTTCA ACACCATGGA GAGGGTTCTG CAGGGTCCGC TTAGTCCCAT 
501 ATTCAAGAAC TCCAGTGTTG GCCCTCTGTA CTCTGGCTGC AGACTGACCT 
551 CTCTCAGGCC CGAGAAGGAT GGGGCAGCAA CTGGAATGGA TGCTGTCTGC 
601 CTCTACCACC CTAATCCCAA AAGACCTGGG CTGGACAGAG AGCAGCTGTA 
651 CTGGGAGCTA AGCCAGCTGA CCCACAACAT CACTGAGCTG GGCCCCTACA 
701 GCCTGGACAG GGACAGTCTC TATGTCAATG GTTTCACCCA TCAGAACTCT
751 GTGCCCACCA CCAGTACTCC TGGGACCTCC ACAGTGTACT GGGCAACCAC 
801 TGGGACTCCA TCCTCCTTCC CCGGCCACAC AGAGCCTGGC CCTCTCCTGA 
851 TACCATTCAC GCTCAACTTC ACCATCACTA ACCTACAGTA TGAGGAGAAC 
901 ATGGGTCACC CTGGCTCCAG GAAGTTCAAC ATCACGGAGA GGGTTCTGCA 
951 GGGTCTGCTT AATCCCATTT TCAAGAACTC CAGTGTTGGC CCTCTGTACT 

1001 CTGGCTGCAG ACTGACCTCT CTCAGGCCCG AGAAGGATGG GGCAGCAACT 

1051 GGAATGGATG CTGTCTGCCT CTACCACCCT AATCCCAAAA GACCTGGGCT 

1101 GGACAGAGAG CAGCTGTACT GCGAGCTAAG CCAGCTGACC CACAACATCA 

1151 CTGAGCTGGG CCCCTACAGC TTGGACAGGG ACAGTCTTTA TGTCAATGGT 
1201 TTCACCCATC AGAACTCTGT GCCCACCACC AGTACTCCTG GGACCTCCAC
1251 AGTGTACTGG GCAACCACTG GGACTCCATC CTCCTTCCCC GGCCACACAG 

1301 AGCCTGGCCC TCTCCTGATA CCATTCACCC TCAACTTCAC CATCACCAAC
1351 CTGCAGTACG AGGAGGACAT GCGTCGCACT GGCTCCAGGA AGTTCAACAC 
1401 CATGGAGAGG GTTCTGCAGG GTCTGCTCAA GCCCTTGTTC AAGAGCACCA 
1451 GCGTTGGCCC TCTGTACTCT GGCTGCAGAC TGACCTTGCT CAGACCTGAG 
1501 AAACATGGGG CAGCCACTGG AGTGGACGCC ATCTGCACCC TCCGCCTTGA 
1551 TCCCACTGGT CCTGGACTGG ACAGAGAGCG GCTATACTGG GAGCTGAGCC 
1601 AGCTGACCAA CAGCGTTACA GAGCTGGGCC CCTACACCCT GGACAGGGAC 
1651 AGTCTCTATG TCAATGGCTT CACCCATCGG AGCTCTGTGC CAACCACCAG 

1701 TATTCCTGGG ACCTCTGCAG TGCACCTGGA AACCTCTGGG ACTCCAGCCT
1751 CCCTCCCTGG CCACACAGCC CCTGGCCCTC TCCTGGTGCC ATTCACCCTC

1801 AACTTCACTA TCACCAACCT GCAGTATGAG GAGGACATGC GTCACCCTGG
1851 TTCCAGGAAG TTCAACACCA CGGAGAGAGT CCTGCAGGGT CTGCTCAAGC 
1901 CCTTGTTCAA GAGCACCAGT GTTGGCCCTC TGTACTCTGG CTGCAGACTG 
1951 ACCTTGCTCA GGCCTGAAAA ACGTGGGGCA GCCACCGGCG TGGACACCAT 
2001 CTGCACTCAC CGCCTTGACC CTCTAAACCC TGGACTGGAC AGAGAGGAGC
2051 TATACTGGGA GCTGAGCAAA CTGACCCGTG GCATCATCGA GCTGGGCCCC
2101 TACCTCCTGG ACAGAGGCAG TCTCTATGTC AATGGTTTCA CCCATCGGAA
2151 CTTTGTGCCC ATCACCAGCA CTCCTGGGAC CTCCACAGTA CACCTAGGAA
2201 CCTCTGAAAC TCCATCCTCC CTACCTAGAC CCATAGTGCC TGGCCCTCTC
2251 CTGATACCAT TCACACTCAA CTTCACCATC ACTAACCTAC AGTATGAGGA
2301 GAACATGGGT CACCCTGGCT CCAGGAAGTT CAACATCACG GAGAGGGTTC
2351 TGCAGGGTCT GCTCAAACCC TTGTTCAGGA ATAGCAGTCT GGAATACCTC
2401 TATTCAGGCT GCAGACTAAC CTCACTCAGG CCAGAGAAGG ATAGCTCAAC

2451 CATGGCAGTG GATGCCATCT GCACACATCG CCCTGACCCT GAAGACCTCG 

2501 GACTGGACAG AGAGCGACTG TACTGGGAGC TGAGCAATCT GACAAATGGC 

2551 ATCCAGGAGC TGGGCCCCTA CACCCTGGAC CGGAACAGTC TCTATGTCAA
2601 TGGTTTCACC CATCGAAGCT CTATGCCCAC CACCAGCACT CCTGGGACCT
2651 CCACAGTGGA TGTGGGAACC TCAGGGACTC CATCCTCCAG CCCCAGCCCC
2701 ACGACTGCTG GCCCTCTCCT GATGCCGTTC ACCCTCAACT TCACCATCAC
2751 CAACCTGCAG TACGAGGAGG ACATGCGTCG CACTGGCTCC AGGAAGTTCA
2801 ACACCATGGA GAGTGTCCTG CAGGGTCTGC TCAAGCCCTT GTTCAAGAAC
2851 ACCAGTGTTG GCCCTCTGTA CTCTGGCTGC AGATTGACCT TGCTCAGGCC
2901 CAAGAAAGAT GGGGCAGCCA CTGGAGTGGA TGCCATCTGC ACCCACCGCC
2951 TTGACCCCAA AAGCCCTGGA CTCAACAGGG AGCAGCTGTA CTGGGAGTTA
3001 AGCAAACTGA CCAATGACAT TGAAGAGGTG GGCCCCTACA CCTTGGACAG
3051 GAACAGTCTC TATGTCAATG GTTTCACCCA TCGGAGCTTT GTGGCCCCCA
3101 CCAGCACTCT TGGGACCTCC ACAGTGGACC TTGGGACCTC AGGGACTCCA
3151 TCCTCCCTCC CCAGCCCCAC AACAGGTGTT CCTCTCCTGA TACCATTCAC
3201 ACTCAACTTC ACCATCACTA ACCTACAGTA TGAGGAGAAC ATGGGTCACC
3251 CTGGCTCCAG GAAGTTCAAC ATCATGGAGA GGGTTCTGCA GGGTCTGCTT
3301 ATGCCCTTGT TCAAGAACAC CAGTGTCAGC TCTCTGTACT CTGGTTGCAG
3351 ACTGACCTTG CTCAGGCCTG AGAAGGATGG GGCAGCCACC AGAGTGGTTG
3401 CTGTCTGCAC CCATCGTCCT GACCCCAAAA GCCCTGGACT GGACAGAGAG
3451 CGGCTGTACT GGAAGCTGAG CCAGCTGACC CACGGCATCA CTGAGCTGGG
3501 CCCCTACACC CTGGACAGGC ACAGTCTCTA TGTCAATGGT TTCACCCATC
3551 AGAGCTCTAT GACGACCACC AGAACTCCTG ATACCTCCAC AATGCACCTG 
3601 GCAACCTCGA GAACTCCAGC CTCCCTGTCT GGACCTACGA CCGCCAGCCC

3651 TCTCCTGATA CCATTCACAA TTAACTTCAC CATCACTAAC CTGCGGTATG 

3701 AGGAGAACAT GCATCACCCT GGCTCTAGAA AGTTTAACAC CACGGAGAGA
3751 GTCCTTCAGG GTCTGCTCAG GCCTGTGTTC AAGAACACCA GTGTTGGCCC
3801 TCTGTACTCT GGCTGCAGAC TGACCTTGCT CAGGCCCAAG AAGGATGGGG
3851 CAGCCACCAA AGTGGATGCC ATCTGCACCT ACCGCCCTGA TCCCAAAAGC
3901 CCTGGACTGG ACAGAGAGCA GCTATACTGG GAGCTGAGCC AGCTAACCCA
3951 CAGCATCACT GAGCTGGGCC CCTACACCCT GGACAGGGAC AGTCTCTATG
4001 TCAATGGTTT CACACAGCGG AGCTCTGTGC CCACCACTAG CATTCCTGGG
4051 ACCCCCACAG TGGACCTGGG AACATCTGGG ACTCCAGTTT CTAAACCTGG
4101 TCCCTCGGCT GCCAGCCCTC TCCTGGTGCT ATTCACTCTC AACTTCACCA
4151 TCACCAACCT GCGGTATGAG GAGAACATGC AGCACCCTGG CTCCAGGAAG
4201 TTCAACACCA CGGAGAGGGT CCTTCAGGGC CTGCTCAGGT CCCTGTTCAA
4251 GAGCACCAGT GTTGGCCCTC TGTACTCTGG CTGCAGACTG ACTTTGCTCA
4301 GGCCTGAAAA GGATGGGACA GCCACTGGAG TGGATGCCAT CTGCACCCAC
4351 CACCCTGACC CCAAAAGCCC TAGGCTGGAC AGAGAGCAGC TGTATTGGGA
4401 GCTGAGCCAG CTGACCCACA ATATCACTGA GCTGGGCCAC TATGCCCTGG
4451 ACAACGACAG CCTCTTTGTC AATGGTTTCA CTCATCGGAG CTCTGTGTCC
4501 ACCACCAGCA CTCCTGGGAC CCCCACAGTG TATCTGGGAG CATCTAAGAC
4551 TCCAGCCTCG ATATTTGGCC CTTCAGCTGC CAGCCATCTC CTGATACTAT
4601 TCACCCTCAA CTTCACCATC ACTAACCTGC GGTATGAGGA GAACATGTGG
4651 CCTGGCTCCA GGAAGTTCAA CACTACAGAG AGGGTCCTTC AGGGCCTGCT
4701 AAGGCCCTTG TTCAAGAACA CCAGTGTTGG CCCTCTGTAC TCTGGCTCCA
4751 GGCTGACCTT GCTCAGGCCA GAGAAAGATG GGGAAGCCAC CGGAGTGGAT
4801 GCCATCTGCA CCCACCGCCC TGACCCCACA GGCCCTGGGC TGGACAGAGA
4851 GCAGCTGTAT TTGGAGCTGA GCCAGCTGAC CCACAGCATC ACTGAGCTGG
4901 GCCCCTACAC ACTGGACAGG GACAGTCTCT ATGTCAATGG TTTCACCCAT
4951 CGGAGCTCTG TACCCACCAC CAGCACCGGG GTGGTCAGCG AGGAGCCATT
5001 CACACTGAAC TTCACCATCA ACAACCTGCG CTACATGGCG GACATGGGCC
5051 AACCCGGCTC CCTCAAGTTC AACATCACAG ACAACGTCAT GAAGCACCTG
5101 CTCAGTCCTT TGTTCCAGAG GAGCAGCCTG GGTGCACGGT ACACAGGCTG
5151 CAGGGTCATC GCACTAAGGT CTGTGAAGAA CGGTGCTGAG ACACGGGTGG
5201 ACCTCCTCTG CACCTACCTG CAGCCCCTCA GCGGCCCAGG TCTGCCTATC
5251 AAGCAGGTGT TCCATGAGCT GAGCCAGCAG ACCCATGGCA TCACCCGGCT
5301 GGGCCCCTAC TCTCTGGACA AAGACAGCCT CTACCTTAAC GGTTACAATG
5351 AACCTGGTCT AGATGAGCCT CCTACAACTC CCAAGCCAGC CACCACATTC
5401 CTGCCTCCTC TGTCAGAAGC CACAACAGCC ATGGGGTACC ACCTGAAGAC
5451 CCTCACACTC AACTTCACCA TCTCCAATCT CCAGTATTCA CCAGATATGG
5501 GCAAGGGCTC AGCTACATTC AACTCCACCG AGGGGGTCCT TCAGCACCTG
5551 CTCAGACCCT TGTTCCAGAA GAGCAGCATG GGCCCCTTCT ACTTGGGTTG
5601 CCAACTGATC TCCCTCAGGC CTGAGAAGGA TGGGGCAGCC ACTGGTGTGG
5651 ACACCACCTG CACCTACCAC CCTGACCCTG TGGGCCCCGG GCTGGACATA
5701 CAGCAGCTTT ACTGGGAGCT GAGTCAGCTG ACCCATGGTG TCACCCAACT
5751 GGGCTTCTAT GTCCTGGACA GGGATAGCCT CTTCATCAAT GGCTATGCAC
5801 CCCAGAATTT ATCAATCCGG GGCGAGTACC AGATAAATTT CCACATTGTC
5851 AACTGGAACC TCAGTAATCC AGACCCCACA TCCTCAGAGT ACATCACCCT
5901 GCTGAGGGAC ATCCAGGACA AGGTCACCAC ACTCTACAAA GGCAGTCAAC
5951 TACATGACAC ATTCCGCTTC TGCCTGGTCA CCAACTTGAC GATGGACTCC
6001 GTGTTGGTCA CTGTCAAGGC ATTGTTCTCC TCCAATTTGG ACCCCAGCCT
6051 GGTGGAGCAA GTCTTTCTAG ATAAGACCCT GAATGCCTCA TTCCATTGGC
6101 TGGGCTCCAC CTACCAGTTG GTGGACATCC ATGTGACAGA AATGGAGTCA
6151 TCAGTTTATC AACCAACAAG CAGCTCCAGC ACCCAGCACT TCTACCCGAA
6201 TTTCACCATC ACCAACCTAC CATATTCCCA GGACAAAGCC CAGCCAGGCA
6251 CCACCAATTA CCAGAGGAAC AAAAGGAATA TTGAGGATGC GCTCAACCAA
6301 CTCTTCCGAA ACAGCAGCAT CAAGAGTTAT TTTTCTGACT GTCAAGTTTC
6351 AACATTCAGG TCTGTCCCCA ACAGGCACCA CACCGGGGTG GACTCCCTGT
6401 GTAACTTCTC GCCACTGGCT CGGAGAGTAG ACAGAGTTGC CATCTATGAG
6451 GAATTTCTGC GGATGACCCG GAATGGTACC CAGCTGCAGA ACTTCACCCT
6501 GGACAGGAGC AGTGTCCTTG TGGATGGGTA TTCTCCCAAC AGAAATGAGC
6551 CCTTAACTGG GAATTCTGAC CTTCCCTTCT GGGCTGTCAT CTTCATCGGC
6601 TTGGCAGGAC TCCTGGGACT CATCACATGC CTGATCTGCG GTGTCCTGGT
6651 GACCACCCGC CGGCGGAAGA AGGAAGGAGA ATACAACGTC CAGCAACAGT
6701 GCCCAGGCTA CTACCAGTCA CACCTAGACC TGGAGGATCT GCAATGACTG
6751 GAACTTGCCG GTGCCTGGGG TGCCTTTCCC CCAGCCAGGG TCCAAAGAAG
6801 CTTGGCTGGG GCAGAAATAA ACCATATTGG TCG
TABLE 9



Complete Amino Acid Sequence for 13 Repeats Contiguous with the Carboxy Terminus 

of CA125 (SEQ ID NO: 50) 



1 

ERVLQGLLKP LFRNSSLEYL YSG CRLASLR PEKDSSAMAV DAIC THRPDP 

EDLGLDRERL YWELSNLTNG IQELGPYTLD RNSLYVNGFT HRSSMPTTST 

PGTSTVDVGT SGTPSSSPSP TTAGPLLMPF TLNFTITNLQ YEEDMRRTGS 

2 

RKFNTMERVL QGPLSPIFKN SSVGPLYSG C RLTSLRPEKD GAATGMDAVC 

LYHPNPKRPG LDREQLYWEL SQLTHNITEL GPYSLDRDSL YVNGFTHQNS 

VPTTSTPGTS TVYWATTGTP SSFPGHTEPG PLLIPFTLNF TITNLQYEEN 

3 

MGHPGSRKFN ITERVLQGLL NPIFKNSSVG PLYSG CRLTS LRPEKDGAAT

GMDAVC LYHP NPKRPGLDRE QLYCELSQLT HNITELGPYS LDRDSLYVNG

FTHQNSVPTT STPGTSTVYW ATTGTPSSFP GHTEPGPLLI PFTLNFTITN

4 

LQYEEDMRRT GSRKFNTMER VLQGLLKPLF KSTSVGPLYS GCRLTLLRPE 

KHGAATGVDA IC TLRLDPTG PGLDRERLYW ELSQLTNSVT ELGPYTLDRD 

SLYVNGFTHR SSVPTTSIPG TSAVHLETSG TPASLPGHTA PGPLLVPFTL 

NFTITNLQYE EDMRHPGSRK FNTTERVLQG LLKPLFKSTS VGPLYSG CRL 
5 

TLLRPEKRGA ATGVDTIC TH RLDPLNPGLD REQLYWELSK LTRGIIELGP 

YLLDRGSLYV NGFTHRNFVP ITSTPGTSTV HLGTSETPSS LPRPIVPGPL 

LIPFTLNFTI TNLQYEENMG HPGSRKFNIT ERVLQGLLKP LFRNSSLEYL 
6 

YSGCRLASLR PEKDSSAMAV DAICTHRPDP EDLGLDRERL YWELSNLTNG 



IQELGPYTLD RNSLYVNGFT HRSSMPTTST PGTSTVDVGT SGTPSSSPSP 

TTAGPLLMPF TLNFTITNLQ YEEDMRRTGS RKFNTMESVL QGLLKPLFKN 

7 

TSVGPLYSGC RLTLLRPKKD GAATGVDAIC THRLDPKSPG LNREQLYWEL 



SKLTNDIEEV GPYTLDRNSL YVNGFTHRSF VAPTSTLGTS TVDLGTSGTP 

SSLPSPTTGV PLLIPFTLNF TITNLQYEEN MGHPGSRKFN IMERVLQGLL 

8 

SPIFKNSSVG SLYSGCRLTL LRPEKDGAAT RVDAVCTHRP DPKSPGLDRE 



RLYWKLSQLT HGIIELGPYT LDRHSFYVNG FTHQSSMTTT RTPDTSTMHL 
ATSRTPASLS GPTTASPLLV LFTINFTITN QRYEENMHHP GSRKFNTTER 
9 

VLQGLLRPVF KNTSVGPLYS G CRLTLLRPK KDGAATKVDA IC TYRPDPKS 

PGLDREQLYW ELSQLTHSIT ELGPYTQDRD SLYVNGFTHR SSVPTTSIPG 

TSAVHLETSG TPASLPGPSA ASPLLVLFTL NFTITNLRYE ENMQHPGSRK 

10 

FNTTERVLQG LLRSLFKSTS VGPLYSG CRL TLLRPEKDGT ATGVDAIC TH 

HPDPKSPRLD REQLYWELSQ LTHNITELGH YALDNDSLFV NGFTHRSSVS 

TTSTPGTPTV YLGASKTPAS IFGPSAASHL LILFTLNFTI TNLRYEENMW 

11 

PGSRKFNTTE RVLQGLLRPL FKNTSVGPLY SG SRLTLLRP EKDGEATGVD 
AICTHRPDPT GPGLDREQLY LELSQLTHSI TELGPYTLDR DSLYVNGFTH 



RSSVPTTSTG VVSEEPFTLN FTINNLRYMA DMGQPGSLKF NITDNVMKHL

12 

LSPLFQRSSL GARYTG CRVI ALRSVKNGAE TRVDLLC TYL QPLSGPGLPI 

KQVFHELSQQ THGITRLGPY SLDKDSLYLN GYNEPGLDEP PTTPKPATTF 

LPPLSEATTA MGYHLKTLTL NFTISNLQYS PDMGKGSATF NSTEGVLQHL 

13 

LRPLFQKSSM GPFYLG CQLI SLRPEKDGAA TGVDTTC TYH PDPVGPGLDI 
QQLYWELSQL THGVTQLGFY VLDRDSLFIN GYAPQNLSIR GEYQINFHIV 
NWNLSNPDPT SSEYITLLRD IQDKVTTLYK GSQLHDTFRF CLVTNLTMDS 
VLVTVKALFS SNLDPSLVEQ VFLDKTLNAS FHWLGSTYQL VDIHVTEMES 
SVYQPTSSSS TQHFYLNFTI TNLPYSQDKA QPGTTNYQRN KRNIEDALNQ 
LFRNSSIKSY FSDCQVSTFR SVPNRHHTGV DSLCNFSPLA RRVDRVAIYE 
EFLRMTRNGT QLQNFTLDRS SVLVDGYSPN RNEPLTGNSD LPFWAVILIG
LAGLLGLITC LICGVLVTTR RRKKEGEYNV QQQCPGYYQS HLDLEDLQ 
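
The relationship between the nucleotide sequence of Table 8 and the amino acid sequence of Table 9 can be spot-checked by in-frame translation. The following is a minimal sketch, assuming the standard genetic code and reading frame 1; the names `CODON_TABLE` and `translate` are illustrative only and are not part of the original disclosure.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, '*' marks a stop codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-30 of SEQ ID NO: 49 as printed in Table 8.
first_30 = "GAGAGGGTTC TGCAGGGTCT GCTCAAACCC"
print(translate(first_30))
```

The printed residues can be compared with the first ten residues of repeat 1 in Table 9 (ERVLQGLLKP).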
TABLE 10A

5' Primer Sequence for End of the Open Reading Frame for Contig #32 of Chromosome
19 Cosmid AC008734 (SEQ ID NO: 51), Primer Sequence from within the Repeat Region
(SEQ ID NO: 52), 3 Primer Sets Synthesized to Piece Together Entire Open Reading
Frame in Contig #32 (SEQ ID NOS: 53 thru 58), Primers to Cosmid No. AC008734 for
Contig #32 (SEQ ID NOS: 59 and 60), Sense Primer Sequence (supplied by Ambion)
(SEQ ID NO: 61), Anti-Sense Primer Sequence for CA125 (SEQ ID NO: 62), and
5' Sense Primer Sequence (from Ambion) (SEQ ID NO: 63) and Anti-Sense Primer
Specific to CA125 (SEQ ID NO: 64)

(SEQ ID NO: 51) 5'-CAGCAGAGACCAGCACGAGTACTC-3'
(SEQ ID NO: 52) 5'-TCCACTGCCATGGCTGAGCT-3'

Primer Sets

(SEQ ID NO: 53) (Set 1) 5'-CCAGCACAGCTCTTCCCAGGAC-3'
(SEQ ID NO: 54) 5'-GGAATGGCTGAGCTGACGTCTG-3'

(SEQ ID NO: 55) (Set 2) 5'-CTTCCCAGGACAACCTCAAGG-3'
(SEQ ID NO: 56) 5'-GCAGGATGAGTGAGCCACGTG-3'

(SEQ ID NO: 57) (Set 3) 5'-GTCAGATCTGGTGACCTCACTG-3'
(SEQ ID NO: 58) 5'-GAGGCACTGGAAAGCCCAGAG-3'

(SEQ ID NO: 59) 5'-CTGATGGCATTATGGAACACATCAC-3'
(SEQ ID NO: 60) 5'-CCCAGAACGAGAGACCAGTGAG-3'

(SEQ ID NO: 61) 5'-GCTGATGGCGATGAATGAACACTG-3'
(SEQ ID NO: 62) 5'-CCCAGAACGAGAGACCAGTGAG-3'

(SEQ ID NO: 63) 5'-CGCGGATCCGAACACTGCGTTTGCTGGCTTTGATG-3'
(SEQ ID NO: 64) 5'-CCTCTGTGTGCTGCTTCATTGGG-3'
TABLE 10B

Sense and Anti-Sense Primers Used to Order the CA125 Carboxy Terminal Domain
(SEQ ID NO: 303 and SEQ ID NO: 304, respectively)

(SEQ ID NO: 303) 5'-GGACAAGGTCACCACACTCTAC-3'
(SEQ ID NO: 304) 5'-GCAGATCCTCCAGGTCTAGGTGTG-3'



TABLE 10C

Sense and Anti-Sense Primers Used to Amplify Overlapping Sequences
in the Repeat Domain
(SEQ ID NO: 305 and SEQ ID NO: 306, respectively)

(SEQ ID NO: 305) 5'-GTC TCT ATG TCA ATG GTT TCA CCC-3'
(SEQ ID NO: 306) 5'-TAG CTG CTC TCT GTC CAG TCC-3'
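
Basic properties of the primers listed in Table 10C can be tabulated directly from the printed sequences. The following is a minimal sketch, assuming only that length and G+C count are of interest; the helper name `primer_stats` is hypothetical, and no melting temperature or other derived value is implied by the source.

```python
def primer_stats(primer: str) -> tuple[int, int]:
    """Return (length, number of G or C bases) for a primer, ignoring spaces."""
    bases = primer.replace(" ", "").upper()
    gc_count = sum(base in "GC" for base in bases)
    return len(bases), gc_count


# Sequences copied from Table 10C (5' to 3'), triplet spacing retained.
sense_305 = "GTC TCT ATG TCA ATG GTT TCA CCC"
antisense_306 = "TAG CTG CTC TCT GTC CAG TCC"

for name, primer in (("SEQ ID NO: 305", sense_305), ("SEQ ID NO: 306", antisense_306)):
    length, gc_count = primer_stats(primer)
    print(f"{name}: {length} nt, {gc_count} G/C ({100 * gc_count / length:.1f}% GC)")
```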
TABLE 11

5' Sense Primer 1 Sequence and 3' Antisense Primer 2
(SEQ ID NO: 65 and SEQ ID NO: 66, respectively), and
Nucleotide and Amino Acid Sequences of the CA125 Repeat Expressed in E. coli
(SEQ ID NO: 67 and SEQ ID NO: 68, respectively)

(SEQ ID NO: 65) 5'-ACCGGATCCATGGGCCACACAGAGCCTGGCCC-3'
(SEQ ID NO: 66) 5'-TGTAAGCTTAGGCAGGGAGGATGGAGTCC-3'
(SEQ ID NO: 67)

1 ATGAGAGGAT CGCATCACCA TCACCATCAC GGATCCATGG GCCACACAGA
51 GCCTGGCCCT CTCCTGATAC CATTCACTTT CAACTTTACC ATCACCAACC
101 TGCATTATGA GGAAAACATG CAACACCCTG GTTCCAGGAA GTTCAACACC
151 ACGGAGAGGG TTCTGCAGGG TCTGCTCAAG CCCTTGTTCA AGAACACCAG
201 TGTTGGCCCT CTGTACTCTG GCTGCAGACT GACCTTGCTC AGACCTGAGA
251 AGCATGAGGC AGCCACTGGA GTGGACACCA TCTGTACCCA CCGCGTTGAT
301 CCCATCGGAC CTGGACTGGA CAGAGAGCGG CTATACTGGG AGCTGAGCCA
351 GCTGACCAAC AGCATCACAG AGCTGGGACC CTACACCCTG GACAGGGACA
401 GTCTCTATGT CAATGGCTTC AACCCTCGGA GCTCTGTGCC AACCACCAGC
451 ACTCCTGGGA CCTCCACAGT GCACCTGGCA ACCTCTGGGA CTCCATCCTC
501 CCTGCCT

(SEQ ID NO: 68)
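
The two primers of Table 11 can be located at the ends of the printed nucleotide sequence (SEQ ID NO: 67). The following is a minimal sketch, assuming that the first three nucleotides of the sense primer and the nucleotides of the antisense primer beyond its first 20 reverse-complemented positions are 5' tails that do not appear in the printed sequence; the helper name `reverse_complement` is illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


sense_primer = "ACCGGATCCATGGGCCACACAGAGCCTGGCCC"   # SEQ ID NO: 65
antisense_primer = "TGTAAGCTTAGGCAGGGAGGATGGAGTCC"  # SEQ ID NO: 66

# Positions 1-60 and 451-507 of SEQ ID NO: 67, copied from Table 11.
insert_head = (
    "ATGAGAGGAT CGCATCACCA TCACCATCAC GGATCCATGG GCCACACAGA GCCTGGCCCT"
).replace(" ", "")
insert_tail = (
    "ACTCCTGGGA CCTCCACAGT GCACCTGGCA ACCTCTGGGA CTCCATCCTC CCTGCCT"
).replace(" ", "")

# The sense primer, minus its first three nucleotides, lies near the 5' end.
sense_found = sense_primer[3:] in insert_head

# The first 20 nt of the reverse-complemented antisense primer close the 3' end.
antisense_found = insert_tail.endswith(reverse_complement(antisense_primer)[:20])

print(sense_found, antisense_found)
```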
TABLE 12

Additional Multiple Repeat Amino Acid Sequences
(SEQ ID NO: 69 thru SEQ ID NO: 80)

(SEQ ID NO: 69)



ERVLQGLLGP MFKNTSVGLL YSG CRLTLLR PKKDGAATKV DAIC TYRPDP 
KSPGLDREQL YWELSQLTHS ITELGPYTLD RDSLYVNGFT QRSSVPTTSI 
PGTPTVDLGT SGTPVSKPGP SAASPLLIPF TINFTITNLR YEENMGHPGS

RKFNIMERVL QGLLKPLFKN TSVGPLYSG C RLTLLRPKKD GAATGVDAIC 
THRLDPKSPG LNREQLYWEL SKLTNDIEEL GPYTLDRNSL YVNGFTHQSS 
VSTTSTPGTS TVDLRTSGTP SSLSSPTIMA AGPLLIPFTI NFTITNLRYE 
ENMHHPGSRK FNTMERVLQG LLMPLFKNTS VSSLYSG CRL TLLRPEKDQA 
ATRVDAVCTH RPDPKSPGLD RERLYWKLSQ LTHGITELGP YTLDRNSLYV 



NGFTHRSSMP TTSTPGTSTV DVGTSGTPSS SPSPTTAGPL LMPFTLNFTI 
TNLQYEEDMR RTGSRKFNTM ERVLQGLLKP LFKSTSVGPL YSG CRLTLLR 
PEKHGAATGV DAIC TLRLDP TGPGLDRERL YWELSQLTNS VTELGPYTLD 
RDSLYVNGFT HRSSVPTTSI PGTSAVHLET SGTPASLPGH TAPGPLLIPF 
TLNFTITNLH YEENMQHPGS RKFNTMERVL QGCLVPCSRN TNVGLLYSGC 
RLTLLRXEKX XAATXVDXXC XXXXDPXXPG LDREXLYWEL SXLTXXIXEL 
GPYTLDRNSL YVNGFTHRSS VAPTSTPGTS TVDLGTSGTP SSLPSPTTVP 
LLVPFTLNFT ITNLQYGEDM RHPGSRKFNT TERVLQGLLG PLFKNSSVGP 
LYSG CRLISL RSEKDGAATG VDAIC THHLN PQSPGLDREQ LYWQLSQVTN 
GIKELGPYTL DRNSLYVNGF THRSSGLTTS TPWTSTVDLG TSGTPSPVPS

PTTAGPLLI
(SEQ ID NO: 70) 

QGLLGPMFKN TSVGLLYSG C RLTLLRPEKR GAATGVDTIC THRLDPLNPG
LDREQLYWEL SKLTRGIIEL GPYLLDRGSL YVNGFTHRNF VPITSTPGTS
TVHLGTSETP SSLPRPIVPG PLLVPFTLNF TITNLQYEEA MRHPGSRKFN
TTERVLQGLL RPLFKNTSVS SLYSG CRLTL LRPEKDGAAT RVDAAC TYRP
DPKSPGLDRE QLYWELSQLT HSITELGPYT LDRVSLYVNG FNPRSSVPTT
STPGTSTVHL ATSGTPSSLP GHTAPVPLLI PFTLNFTITN LQYEEDMRHP
GSRKFNTMER VLQGLLRPLF KNTSIGPLYS S CRLTLLRPE KDKAATRVDA
ICTHHPDPQS PGLNREQLYW ELSQLTHGIT ELGPYTLDRD SLYVDGFTHW
SPIPTTSTPG TSIVNLGTSG IPPSLPETTA TGPLLIPFTP NFTITNLQYE
EDMRRTGSRK FNTMERVLQG LLSPIFKNSS VGPLYSG CRL TSLRPEKDGA
ATGMDAVCLY HPNPKRPGLD REQLY

(SEQ ID NO: 71)

ERVLQGLLKP LFKSTSVGPL YSG CRLTLLR PEKDGVATRV DAIC THRPDP
KIPGLDRQQL YWELSQLTHS ITELGPYTLD RDSLYVNGFT QRSSVPTTST
PGTFTVQPET SETPSSLPGP TATGPVLLPF TLNFTIINLQ YEEDMHRPGS
RKFNTTERVL QGLLMPLFKN TSVGPLYSG C RLTLLRPEKQ EAATGVDTIC
THRLDPSEPG LDREQLYWEL SQLTNSITEL GPYTLDRDSL YVNGFTHSGV
LCPPPSILGI FTVQPETFET PSSLPGPTAT GPVLLPFTLN FTIINLQYEE
DMHRPGSRKF NTTERVLQGL LTPLFKNTSV GPLYSG CRLT LLRPEKQEAA
TGVDTICTHR VDPIGPGLDR ERLYWELSQL TNSITELGPY TLDRDSLYVN
GFNPWSSVPT TSTPGTSTVH LATSGTPSSL PGHTAPVPLL IPFTLNFTIT
NLHYEENMQH PGSRKFNTTE RVLQGLLKPL FKSTSVGPLY SG CRLTLLRP
EKHGAATGVD AIC THRLDPK SPGVDREQLY WELSQLTNGI KELGPYTLDR
NSLYVNGFTH WIPVPTSSTP GTSTVDLGSG TPSSLPSPTT AGPL

(SEQ ID NO: 72)

TSVGPLYSG C RLTLLRSEKD GAATGVDAIY THRLDPKSPG VDREQLYWEL
SQLTNGIKEL GPYTLDRNSL YVNGFTHQTS APNTSTPGTS TVDLGTSGTP
SSLPSPTSAG PLLIPFTINF TITNLRYEEN MHHPGSRKFN TMERVLQGLL
KPLFKSTSVG PLYSG CRLTL LRPEKDGVAT RVDAIC THRP DPKIPGLDRQ
QLYWELSQLT HSITELGPYT LDRDSLYVNG FTQRSSVPTT STPGTFTVQP
ETSETPSSLP GPTATGPVLL PFTLNFTIIN LQYEEDMHRP GSRKFNTTER
VLQGLLKPLF KSTSVGPLYS G CRLTLLRPE KHGAATGVDA IC TLRLDPTG
PGLDRERLYW ELSQLTNSIT ELGPYTLDRD SLYVNGFNPW SSVPTTSTPG
TSTVHLATSG TPSSLPGHTA PVPL

(SEQ ID NO: 73)

ERVLQGLLKP LFKSTSVGPL YSG CRLTLLR PEKRGAATGV DTIC THRLDP
LNPGLDREQL YWELSKLTRG IIELGPYLLD RDSLYVNGFT HRSSVPTTSI
PGTSAVHLET SGTPASLPGH TAPGPLLVPF TLNFTITNLQ YEEDMRHPGS
RKFNTTERVL QGLLKPLFKS TSVGPLYSG C RLTLLRPEKR GAATGVDTIC
THRLDPLNPG LDREQLYWEL SKLTRGIIEL GPYLLDRGSL YVNGFTHRNF
VPITSTPGTS TVHLGTSETP SSLPRPIVPG PLLIPF
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TABLE 12 -continued 



Additional Multiple Repeat Amino Acid Sequences 
(SEQ ID NO: 6 9 through SEQ ID NO: 80) 

(SEQ ID NO: 74)

ERVLQGLLRP VFKNTSVGPL YSGCRLTLLR PKKDGAATKV DAICTYRPDP
KSPGLDREQL YWELSQLTHS ITELGPYTLD RDSLYVNGFT QRSSVPTTSI
PGTPTVDLGT SGTPVSKPGP SAASPLLVPF TLNFTITNLQ YEEDMHRPGS
RKFNATERVL QGLLSPIFKN SSVGPLYSGC RLTSLRPEKD GAATGMDAVC
LYHPNPKRPG LDREQLYWEL SQLTHNITEL GPYSLDRDSL YVNGFTHQSS
MTTTRTPDTS TMHLATSRTP ASLSGPTTAS PLLIPF

(SEQ ID NO: 75)

ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKRGAATGV DTICTHRLDP
LNPGLDREQL YWELSKLTRG IIELGPYLLD RGSLYVNGFS RQSSMTTTRT
PDTSTMHLAT SRTPASLSGP TTASPLLIPF TLNFTITNLQ YEENMGHPGS
RKFNIMERVL QGLLNPIFKN SSVGPLYSGC RLTSLKPEKD GAATGMDAVC
LYHPNPKRPG LDREQLYWEL SQLTHGIKEL GPYTLDRNSL YVNGFTHRSS
VAPTSTPGTS TVDLGTSGTP SSLPSPTTAV PLLIPF

(SEQ ID NO: 76)

ERVLQGLLKP LFRNSSLEYL YSGCRLASLR PEKDSSAMAV DAICTHRPDP
EDLGLDRERL YWELSNLTNG IQELGPYTLD RNSLYVNGFT HRSSGLTTST
PWTSTVDLGT SGTPSPVPSP TTAGPLLIPF TLNFTITNLQ YEENMGHPGS
RKFNIMERVL QGLLMPLFKN TSVSSLYSGC RLTLLRPEKD GAATRVDAVC
TQRPDPKSPG LDRERLYWKL SQLTHGITEL GPYTLDRHSL YVNGLTHQSS
MTTTRTPDTS TMHLATSRTP ASLSGPTTAS PLLIPF

TABLE 12 -continued

Additional Multiple Repeat Amino Acid Sequences
(SEQ ID NO: 69 through SEQ ID NO: 80)



(SEQ ID NO: 77)

ERVLQGLLSP ISKNSSVGPL YSGCRLTSLR PEKDGAATGM DAVCLYHPNP
KRPGLDREQL YWELSQLTHN ITELGPYSLD RDSLYVNGFT HQNSVPTTST
PGTSTVYWAT TGTPSSFPGH TEPGPLLIPF TVNFTITNLR YEENMHHPGS
RKFNTTERVL QGLLRPVFKN TSVGPLYSGC RLTLLRPKKD GAATKVDAIC
TYRPDPKSPG LDREQLYWEL SKLTNDIEEL GPYTLDRNSL YVNGFTHQSS
VSTTSTPGTS TVDLRTSGTP SSLSSPTIMA AGPLLIPF

(SEQ ID NO: 78)

ERVLHGLLTP LFKNTRVGPL YSGCRLTLLR PEKQEAATGV DTICTHRVDP
IGPGLDRERL YWELSQLTNS ITELGPYTLD RDSLYVNGFN PWSSVPTTST
PGTSTVHLAT SGTPSSLPGH TAPVPLLIPF TLNFTITNLH YEENMQHPGS
RKFNTTERVL QGLLKPLFKN TSVGPLYSGC RLTLFKPEKH EAATGVDAIC
TLRLDPTGPG LDRQLYWELS QLTNSVTELG PYTLDRDSLY VNGFTHRSSV
PTTSIPGTSA VHLETSGTPA SLPGHTAPGP LLIPFTLNFT ITNLQYEEDM
RRTGSRKFNT MERVLQGLLK PLFKSTSVGP LYSGCRLTLL RPEKRGAATG
VDTICTHRLD PLNPGLDREQ LYWELSKLTR GIIELGPYLL DRGSLYVNGF
THRNFVPITS TPGTSTVHLG TSETPSSLPR PIVPGPLLIP FTINFTITNL
RYEENMHHPG SRKFNIMERV LQGLLGPLFK NSSVGPLYSG CRLISLRSEK
DGAATGVDAI CTHHLNPQSP GLDREQLYWQ LSQMTNGIKE LGPYTLDRNS
LYVNGFTHRS SGLTTSTPWT STVDLGTSGT PSPVPSPTTA GPLLIPF

TABLE 12 -continued

Additional Multiple Repeat Amino Acid Sequences
(SEQ ID NO: 69 through SEQ ID NO: 80)



(SEQ ID NO: 79)

GPLYSGCRLT SLRPEKDGAA TGMDAVCLYH PNPKRPGLDR EQLYWELSQL
THNITELGPY SLDRDSLYVN GFTHQNSVPT TSTPGTSTVY WATTGTPSSF
PGHTEPGPLL IPFTLNFTIT NLQYEENMGH PGSRKFNITE SVLQGLLTPL
FKMSSVGPLY SGCRLISLRS EKDGAATGVD AICTHHLNPQ SPGLDREQLY
WQLSQMTNGI KELGPYTLDR DSLYVNGFTH RSLGLTTSTP WTSTVDLGTS
GTPSPVPSPT TAGPLLIPFT LNFTITNLQY EENMGHPGSR KFNIMERVLQ
GLLRPVFKNT SVGPLYSGCR LTLLRPKKDG AATKVDAICT YRPDPKSPGL
DREQLYWELS QLTHSITELG PYTLDRDSLY VNGFTQRSSV PTTSIPGTPT
VDLGTSGTPV SKPGPSAASP

(SEQ ID NO: 80)

QLYWELSKLT NDIEELGPYT LDRNSLYVNG FTHQSSVSTT STPGTSTVDL
RTSGTPSSLS SPTIMAAGPL LIPFTLNFTI TNLQYEENMG HPGSRKFNIM
ERVLQGLLGP MFKNTSVGLL YSGCRLTLLR PEKNGAATGM DAICSHRLDP
KSPGLNREQL YWELSQLTHG IKELGPYTLD RNSLYVNGFT HRSSVAPTST
PGTSTVDLGT SGTPSSLPSP TTAVPLLIPF TLNFTITNLK YEEDMHCPGS
RKFNTTERVL QSLFGPMFKN TSVGPLYSGC RLTLLRSEKD GAATGVDAIC
THRLDPKSLG VDREQLYWEL SQLTNGIKEL GPYTLDRNSL YVNGFTHQTS
APNTSTPGTS TVDLGTSGTP SSLPSPTSAG PLLVPFTLNF TITNLQYEED
MRRTGSRKFN TMESVLQGLL KPLFKNTSVG PLYSGCRLTL LRPEKDGAAT
GVDAICTHRL DPKSPGLNRE QLYWELSKL

TABLE 13

Amino Terminal Nucleotide Sequence
(SEQ ID NO: 81)

1 CAGAGAGCGT TGAGCTGGGA ACAGTGACAA GTGCTTATCA AGTTCCTTCA
51 CTCTCAACAC GGTTGACAAG AACTGATGGC ATTATGGAAC ACATCACAAA
101 AATACCCAAT GAAGCAGCAC ACAGAGGTAC CATAAGACCA GTCAAAGGCC
151 CTCAGACATC CACTTCGCCT GCCAGTCCTA AAGGACTACA CACAGGAGGG
201 ACAAAAAGAA TGGAGACCAC CACCACAGCT TTGAAGACCA CCACCACAGC
251 TTTGAAGACC ACTTCCAGAG CCACCTTGAC CACCAGTGTC TATACTCCCA
301 CTTTGGGAAC ACTGACTCCC CTCAATGCAT CAAGGCAAAT GGCCAGCACA
351 ATCCTCACAG AAATGATGAT CACAACCCCA TATGTTTTCC CTGATGTTCC
401 AGAAACGACA TCCTCATTGG CTACCAGCCT GGGAGCAGAA ACCAGCACAG
451 CTCTTCCCAG GACAACCCCA TCTGTTCTCA ATAGAGAATC AGAGACCACA
501 GCCTCACTGG TCTCTCGTTC TGGGGCAGAG AGAAGTCCGG TTATTCAAAC
551 TCTAGATGTT TCTTCTAGTG AGCCAGATAC AACAGCTTCA TGGGTTATCC
601 ATCCTGCAGA GACCATCCCA ACTGTTTCCA AGACAACCCC CAATTTTTTC
651 CACAGTGAAT TAGACACTGT ATCTTCCACA GCCACCAGTC ATGGGGCAGA
701 CGTCAGCTCA GCCATTCCAA CAAATATCTC ACCTAGTGAA CTAGATGCAC
751 TGACCCCACT GGTCACTATT TCGGGGACAG ATACTAGTAC AACATTCCCA
801 ACACTGACTA AGTCCCCACA TGAAACAGAG ACAAGAACCA CATGGCTCAC
851 TCATCCTGCA GAGACCAGCT CAACTATTCC CAGAACAATC CCCAATTTTT
901 CTCATCATGA ATCAGATGCC ACACCTTCAA TAGCCACCAG TCCTGGGGCA
951 GAAACCAGTT CAGCTATTCC AATTATGACT GTCTCACCTG GTGCAGAAGA



TABLE 13 -continued

Amino Terminal Nucleotide Sequence
(SEQ ID NO: 81)

1001 TCTGGTGACC TCACAGGTCA CTAGTTCTGG GACAGACAGA AATATGACTA
1051 TTCCAACTTT GACTCTTTCT CCTGGTGAAC CAAAGACGAT AGCCTCATTA
1101 GTCACCCATC CTGAAGCACA GACAAGTTCG GCCATTCCAA CTTCAACTAT
1151 CTCGCCTGCT GTATCACGGT TGGTGACCTC AATGGTCACC AGTTTGGCGG
1201 CAAAGACAAG TACAACTAAT CGAGCTCTGA CAAACTCCCC TGGTGAACCA
1251 GCTACAACAG TTTCATTGGT CACGCATCCT GCACAGACCA GCCCAACAGT
1301 TCCCTGGACA ACTTCCATTT TTTTCCATAG TAAATCAGAC ACCACACCTT
1351 CAATGACCAC CAGTCATGGG GCAGAATCCA GTTCAGCTGT TCCAACTCCA
1401 ACTGTTTCAA CTGAGGTACC AGGAGTAGTG ACCCCTTTGG TCACCAGTTC
1451 TAGGGCAGTG ATCAGTACAA CTATTCCAAT TCTGACTCTT TCTCCTGGTG
1501 AACCAGAGAC CACACCTTCA ATGGCCACCA GTCATGGGGA AGAAGCCAGT
1551 TCTGCTATTC CAACTCCAAC TGTTTCACCT GGGGTACCAG GAGTGGTGAC
1601 CTCTCTGGTC ACTAGTTCTA GGGCAGTGAC TAGTACAACT ATTCCAATTC
1651 TGACTTTTTC TCTTGGTGAA CCAGAGACCA CACCTTCAAT GGCCACCAGT
1701 CATGGGACAG AAGCTGGCTC AGCTGTTCCA ACTGTTTTAC CTGAGGTACC
1751 AGGAATGGTG ACCTCTCTGG TTGCTAGTTC TAGGGCAGTA ACCAGTACAA
1801 CTCTTCCAAC TCTGACTCTT TCTCCTGGTG AACCAGAGAC CACACCTTCA
1851 ATGGCCACCA GTCATGGGGC AGAAGCCAGC TCAACTGTTC CAACTGTTTC
1901 ACCTGAGGTA CCAGGAGTGG TGACCTCTCT GGTCACTAGT TCTAGTGGAG
1951 TAAACAGTAC AAGTATTCCA ACTCTGATTC TTTCTCCTGG TGAACTAGAA

TABLE 13 -continued

Amino Terminal Nucleotide Sequence
(SEQ ID NO: 81)





2001 ACCACACCTT CAATGGCCAC CAGTCATGGG GCAGAAGCCA GCTCAGCTGT
2051 TCCAACTCCA ACTGTTTCAC CTGGGGTATC AGGAGTGGTG ACCCCTCTGG
2101 TCACTAGTTC CAGGGCAGTG ACCAGTACAA CTATTCCAAT TCTAACTCTT
2151 TCTTCTAGTG AGCCAGAGAC CACACCTTCA ATGGCCACCA GTCATGGGGT
2201 AGAAGCCAGC TCAGCTGTTC TAACTGTTTC ACCTGAGGTA CCAGGAATGG
2251 TGACCTCTCT GGTCACTAGT TCTAGAGCAG TAACCAGTAC AACTATTCCA
2301 ACTCTGACTA TTTCTTCTGA TGAACCAGAG ACCACAACTT CATTGGTCAC
2351 CCATTCTGAG GCAAAGATGA TTTCAGCCAT TCCAACTTTA GCTGTCTCCC
2401 CTACTGTACA AGGGCTGGTG ACTTCACTGG TCACTAGTTC TGGGTCAGAG
2451 ACCAGTGCGT TTTCAAATCT AACTGTTGCC TCAAGTCAAC CAGAGACCAT
2501 AGACTCATGG GTCGCTCATC CTGGGACAGA AGCAAGTTCT GTTGTTCCAA
2551 CTTTGACTGT CTCCACTGGT GAGCCGTTTA CAAATATCTC ATTGGTCACC
2601 CATCCTGCAG AGAGTAGCTC AACTCTTCCC AGGACAACCT CAAGGTTTTC
2651 CCACAGTGAA TTAGACACTA TGCCTTCTAC AGTCACCAGT CCTGAGGCAG
2701 AATCCAGCTC AGCCATTTCA ACTACTATTT CACCTGGTAT ACCAGGTGTG
2751 CTGACATCAC TGGTCACTAG CTCTGGGAGA GACATCAGTG CAACTTTTCC
2801 AACAGTGCCT GAGTCCCCAC ATGAATCAGA GGCAACAGCC TCATGGGTTA
2851 CTCATCCTGC AGTCACCAGC ACAACAGTTC CCAGGACAAC CCCTAATTAT
2901 TCTCATAGTG AACCAGACAC CACACCATCA ATAGCCACCA GTCCTGGGGC
2951 AGAAGCCACT TCAGATTTTC CAACAATAAC TGTCTCACCT GATGTACCAG

TABLE 13 -continued

Amino Terminal Nucleotide Sequence
(SEQ ID NO: 81)



3001 ATATGGTAAC CTCACAGGTC ACTAGTTCTG GGACAGACAC CAGTATAACT
3051 ATTCCAACTC TGACTCTTTC TTCTGGTGAG CCAGAGACCA CAACCTCATT
3101 TATCACCTAT TCTGAGACAC ACACAAGTTC AGCCATTCCA ACTCTCCCTG
3151 TCTCCCCTGG TGCATCAAAG ATGCTGACCT CACTGGTCAT CAGTTCTGGG
3201 ACAGACAGCA CTACAACTTT CCCAACACTG ACGGAGACCC CATATGAACC
3251 AGAGACAACA GCCATACAGC TCATTCATCC TGCAGAGACC AACACAATGG
3301 TTCCCAAGAC AACTCCCAAG TTTTCCCATA GTAAGTCAGA CACCACACTC
3351 CCAGTAGCCA TCACCAGTCC TGGGCCAGAA GCCAGTTCAG CTGTTTCAAC
3401 GACAACTATC TCACCTGATA TGTCAGATCT GGTGACCTCA CTGGTCCCTA
3451 GTTCTGGGAC AGACACCAGT ACAACCTTCC CAACATTGAG TGAGACCCCA
3501 TATGAACCAG AGACTACAGT CACGTGGCTC ACTCATCCTG CAGAAACCAG
3551 CACAACGGTT TCTGGGACAA TTCCCAACTT TTCCCATAGG GGATCAGACA
3601 CTGCACCCTC AATGGTCACC AGTCCTGGAG TAGACACGAG GTCAGGTGTT
3651 CCAACTACAA CCATCCCACC CAGTATACCA GGGGTAGTGA CCTCACAGGT
3701 CACTAGTTCT GCAACAGACA CTAGTACAGC TATTCCAACT TTGACTCCTT
3751 CTCCTGGTGA ACCAGAGACC ACAGCCTCAT CAGCTACCCA TCCTGGGACA
3801 CAGACTGGCT TCACTGTTCC AATTCGGACT GTTCCCTCTA GTGAGCCAGA
3851 TACAATGGCT TCCTGGGTCA CTCATCCTCC ACAGACCAGC ACACCTGTTT
3901 CCAGAACAAC CTCCAGTTTT TCCCATAGTA GTCCAGATGC CACACCTGTA
3951 ATGGCCACCA GTCCTAGGAC AGAAGCCAGT TCAGCTGTAC TGACAACAAT

TABLE 13 -continued

Amino Terminal Nucleotide Sequence
(SEQ ID NO: 81)





4001 CTCACCTGGT GCACCAGAGA TGGTGACTTC ACAGATCACT AGTTCTGGGG
4051 CAGCAACCAG TACAACTGTT CCAACTTTGA CTCATTCTCC TGGTATGCCA
4101 GAGACCACAG CCTTATTGAG CACCCATCCC AGAACAGGGA CAAGTAAAAC
4151 ATTTCCTGCT TCAACTGTGT TTCCTCAAGT ATCAGAGACC ACAGCCTCAC
4201 TCACCATTAG ACCTGGTGCA GAGACTAGCA CAGCTCTCCC AACTCAGACA
4251 ACATCCTCTC TCTTCACCCT ACTTGTAACT GGAACCAGCA GAGTTGATCT
4301 AAGTCCAACT GCTTCACCTG GTGTTTCTGC AAAAACAGCC CCACTTTCCA
4351 CCCATCCAGG GACAGAGACC AGCACAATGA TTCCAACTTC AACTCTTTCC
4401 CTTGGTTTAC TAGAGACTAC AGGCTTACTG GCCACCAGCT CTTCAGCAGA
4451 GACCAGCACG AGTACTCTAA CTCTGACTGT TTCCCCTGCT GTCTCTGGGC
4501 TTTCCAGTGC CTCTATAACA ACTGATAAGC CCCAAACTGT GACCTCCTGG
4551 AACACAGAAA CCTCACCATC TGTAACTTCA GTTGGACCCC CAGAATTTTC
4601 CAGGACTGTC ACAGGCACCA CTATGACCTT GATACCATCA GAGATGCCAA
4651 CACCACCTAA AACCAGTCAT GGAGAAGGAG TGAGTCCAAC CACTATCTTG
4701 AGAACTACAA TGGTTGAAGC CACTAATTTA GCTACCACAG GTTCCAGTCC
4751 CACTGTGGCC AAGACAACAA CCACCTTCAA TACACTGGCT GGAAGCCTCT
4801 TTACTCCTCT GACCACACCT GGGATGTCCA CCTTGGCCTC TGAGAGTGTG
4851 ACCTCAAGAA CAAGTTATAA CCATCGGTCC TGGATCTCCA CCACCAGCAG
4901 TTATAACCGT CGGTACTGGA CCCCTGCCAC CAGCACTCCA GTGACTTCTA
4951 CATTCTCCCC AGGGATTTCC ACATCCTCCA TCCCCAGCTC CACAGCAGCC

TABLE 13 -continued

Amino Terminal Nucleotide Sequence
(SEQ ID NO: 81)

5001 ACAGTCCCAT TCATGGTGCC ATTCACCCTC AACTTCACCA TCACCAACCT
5051 GCAGTACGAG GAGGACATGC GGCACCCTGG TTCCAGGAAG TTCAACGCCA
5101 CAGAGAGAGA ACTGCAGGGT CTGCTCAAAC CCTTGTTCAG GAATAGCAGT
5151 CTGGAATACC TCTATTCAGG CTGCAGACTA GCCTCACTCA GGCCAGAGAA
5201 GGATAGCTCA GCCATGGCAG TGGATGCCAT CTGCACACAT CGCCCTGACC
5251 CTGAAGACCT CGGACTGGAC AGAGAGCGAC TGTACTGGGA GCTGAGCAAT
5301 CTGACAAATG GCATCCAGGA GCTGGGCCCC TACACCCTGG ACCGGAACAG
5351 TCTCTATGTC AATGGTTTCA CCCATCGAAG CTCTATGCCC ACCACCAGCA
5401 CTCCTGGGAC CTCCACAGTG GATGTGGGAA CCTCAGGGAC TCCATCCTCC
5451 AGCCCCAGCC CCACG

TABLE 14

Amino Terminal Protein Sequence
(SEQ ID NO: 82)





1 ESVLEGTVTS AYQVPSLSTR LTRTDGIMEH ITKIPNEAAH RGTIRPVKGP
51 QTSTSPASPK GLHTGGTKRM ETTTTALKTT TTALKTTSRA TLTTSVYTPT
101 LGTLTPLNAS RQMASTILTE MMITTPYVFP DVPETTSSLA TSLGAETSTA
151 LPRTTPSVLN RESETTASLV SRSGAERSPV IQTLDVSSSE PDTTASWVIH
201 PAETIPTVSK TTPNFFHSEL DTVSSTATSH GADVSSAIPT NISPSELDAL
251 TPLVTISGTD TSTTFPTLTK SPHETETRTT WLTHPAETSS TIPRTIPNFS
301 HHESDATPSI ATSPGAETSS AIPIMTVSPG AEDLVTSQVT SSGTDRNMTI
351 PTLTLSPGEP KTIASLVTHP EAQTSSAIPT STISPAVSRL VTSMVTSLAA
401 KTSTTNRALT NSPGEPATTV SLVTHPAQTS PTVPWTTSIF FHSKSDTTPS
451 MTTSHGAESS SAVPTPTVST EVPGVVTPLV TSSRAVISTT IPILTLSPGE
501 PETTPSMATS HGEEASSAIP TPTVSPGVPG VVTSLVTSSR AVTSTTIPIL
551 TFSLGEPETT PSMATSHGTE AGSAVPTVLP EVPGMVTSLV ASSRAVTSTT
601 LPTLTLSPGE PETTPSMATS HGAEASSTVP TVSPEVPGVV TSLVTSSSGV
651 [illegible] MSTSHGAEAS SAVPTPTVSP [illegible] GVSGVVTPLV
701 TSSRAVTSTT IPILTLSSSE PETTPSMATS HGVEASSAVL TVSPEVPGMV
751 TSLVTSSRAV TSTTIPTLTI SSDEPETTTS LVTHSEAKMI SAIPTLAVSP
801 TVQGLVTSLV TSSGSETSAF SNLTVASSQP ETIDSWVAHP GTEASSVVPT
851 LTVSTGEPFT NISLVTHPAE SSSTLPRTTS RFSHSELDTM PSTVTSPEAE
901 SSSAISTTIS PGIPGVLTSL VTSSGRDISA TFPTVPESPH ESEATASWVT

TABLE 14 -continued

Amino Terminal Protein Sequence
(SEQ ID NO: 82)



951 HPAVTSTTVP RTTPNYSHSE PDTTPSIATS PGAEATSDFP TITVSPDVPD
1001 MVTSQVTSSG TDTSITIPTL TLSSGEPETT TSFITYSETH TSSAIPTLPV
1051 SPGASKMLTS LVISSGTDST TTFPTLTETP YEPETTAIQL IHPAETNTMV
1101 PRTTPKFSHS KSDTTLPVAI TSPGPEASSA VSTTTISPDM SDLVTSLVPS
1151 SGTDTSTTFP TLSETPYEPE TTATWLTHPA ETSTTVSGTI PNFSHRGSDT
1201 APSMVTSPGV DTRSGVPTTT IPPSIPGVVT SQVTSSATDT STAIPTLTPS
1251 PGEPETTASS ATHPGTQTGF TVPIRTVPSS EPDTMASWVT HPPQTSTPVS
1301 RTTSSFSHSS PDATPVMATS PRTEASSAVL TTISPGAPEM VTSQITSSGA
1351 ATSTTVPTLT HSPGMPETTA LLSTHPRTET SKTFPASTVF PQVSETTASL
1401 TIRPGAETST ALPTQTTSSL FTLLVTGTSR VDLSPTASPG VSAKTAPLST
1451 HPGTETSTMI PTSTLSLGLL ETTGLLATSS SAETSTSTLT LTVSPAVSGL
1501 SSASITTDKP QTVTSWNTET SPSVTSVGPP EFSRTVTGTT MTLIPSEMPT
1551 PPKTSHGEGV SPTTILRTTM VEATNLATTG SSPTVAKTTT TFNTLAGSLF
1601 TPLTTPGMST LASESVTSRT SYNHRSWIST TSSYNRRYWT PATSTPVTST
1651 FSPGISTSSI PSSTAATVPF MVPFTLNFTI TNLQYEEDMR HPGSRKFNAT
1701 ERELQGLLKP LFRNSSLEYL YSGCRLASLR PEKDSSAMAV DAICTHRPDP
1751 EDLGLDRERL YWELSNLTNG IQELGPYTLD RNSLYVNGFT HRSSMPTTST
1801 PGTSTVDVGT SGTPSSSPSP T

TABLE 15

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)



(SEQ ID NO: 83)

1 GCCACAGTCC CATTCATGGT GCCATTCACC CTCAACTTCA CCATCACCAA
51 CCTGCAGTAC GAGGAGGACA TGCGGCACCC TGGTTCCAGG AAGTTCAACG
101 CCACAGAGAG AGAACTGCAG GGTCTGCTCA AACCCTTGTT CAGGAATAGC
151 AGTCTGGAAT ACCTCTATTC AGGCTGCAGA CTAGCCTCAC TCAGGCCAGA
201 GAAGGATAGC TCAGCCATGG CAGTGGATGC CATCTGCATA CATCGCCCTG
251 ACCCTGAAGA CCTCGGACTG GACAGAGAGC GACTGTACTG GGAGCTGAGC
301 AATCTGACAA ATGGCATCCA GGAGCTGGGC CCCTACACCC TGGACCGGAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCG AAGCTCTATG CCCACCACCA
401 GCACTCCTGG GACCTCCACA GTGGATGTGG GAACCTCAGG GACTCCATCC
451 TCCAGCCCCA GCCCCACG

(SEQ ID NO: 84)

1 GCTGCTGGCC CTCTCCTGAT GCCGTTCACC CTCAACTTCA CCATCACCAA
51 CCTGCAGTAC GAGGAGGACA TGCGTCGCAC TGGCTCCAGG AAGTTCAACA
101 CCATGGAGAG TGTCCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAACACC
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA TTGACCTTGC TCAGGCCCAA
201 GAAAGATGGG GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGCCTTG
251 ACCCCAAAAG CCCTGGACTC AACAGGGAGC AGCTGTACTG GGAGCTAAGC
301 AAACTGACCA ATGACATTGA AGAGCTGGGC CCCTACACCC TGGACAGGAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GAGCTCTGTG TCCACCACCA
401 GCACTCCTGG GACCTCCACA GTGGATCTCA GAACCTCAGG GACTCCATCC

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)

451 TCCCTCTCCA GCCCCACAAT TATG

(SEQ ID NO: 85)

1 GCTGCTGGCC CTCTCCTGGT ACCATTCACC CTCAACTTCA CCATCACCAA
51 CCTGCAGTAT GGGGAGGACA TGGGTCACCC TGGCTCCAGG AAGTTCAACA
101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTG GTCCCATATT CAAGAACACC
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCTC TCAGGTCTGA
201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCATC CATCATCTTG
251 ACCCCAAAAG CCCTGGACTC AACAGAGAGC GGCTGTACTG GGAGCTGAGC
301 CAACTGACCA ATGGCATCAA AGAGCTGGGC CCCTACACCC TGGACAGGAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GACCTCTGTG CCCACCACCA
401 GCACTCCTGG GACCTCCACA GTGGACCTTG GAACCTCAGG GACTCCATTC
451 TCCCTCCCAA GCCCCGCA

(SEQ ID NO: 86)

1 ACTGCTGGCC CTCTCCTGGT GCTGTTCACC CTCAACTTCA CCATCACCAA
51 CCTGAAGTAT GAGGAGGACA TGCATCGCCC TGGCTCCAGG AAGTTCAACA
101 CCACTGAGAG GGTCCTGCAG ACTCTGCTTG GTCCTATGTT CAAGAACACC
151 AGTGTTGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTCCGA
201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTCTTG
251 ACCCCAAAAG CCCTGGACTG GACAGAGAGC AGCTATACTG GGAGCTGAGC
301 CAGCTGACCA ATGGCATCAA AGAGCTGGGC CCCTACACCC TGGACAGGAA

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)



351 CAGTCTCTAT GTCAATGGTT TCACCCATTG GATCCCTGTG CCCACCAGCA
401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGTCAGGGAC TCCATCCTCC
451 CTCCCCAGCC CCACA

(SEQ ID NO: 87)

1 GCTGCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA
51 CCTGCAGTAC GAGGAGGACA TGCATCACCC AGGCTCCAGG AAGTTCAACA
101 CCACGGAGCG GGTCCTGCAG GGTCTGCTTG GTCCCATGTT CAAGAACACC
151 AGTGTCGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTCCGA
201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTCTTG
251 ACCCCAAAAG CCCTGGAGTG GACAGGGAGC AGCTATACTG GGAGCTGAGC
301 CAGCTGACCA ATGGCATCAA AGAGCTGGGT CCCTACACCC TGGACAGAAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GACCTCTGCG CCCAACACCA
401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGACCTCAGG GACTCCATCC
451 TCCCTCCCCA GCCCTACA

(SEQ ID NO: 88)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA
51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA
101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACACC
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTCCGA
201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTCTTG
251 ACCCCAAAAG CCCTGGAGTG GACAGGGAGC AGCTATACTG GGAGCTGAGC

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)

301 CAGCTGACCA ATGGCATCAA AGAGCTGGGT CCCTACACCC TGGACAGAAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GACCTCTGCG CCCAACACCA
401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGACCTCAGG GACTCCATCC
451 TCCCTCCCCA GCCCTACA

(SEQ ID NO: 89)

1 TCTGCTGGCC CTCTCCTGGT GCCATTCACC [illegible]
51 CCTGCAGTAC GAGGAGGACA TGCATCACCC AGGCTCCAGG AAGTTCAACA
101 CCACGGAGCG GGTCCTGCAG GGTCTGCTTG GTCCCATGTT CAAGAACACC
151 AGTGTCGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA
201 GAAGAATGGG GCAGCCACTG GAATGGATGC CATCTGCAGC CACCGTCTTG
251 ACCCCAAAAG CCCTGGACTC AACAGAGAGC AGCTGTACTG GGAGCTGAGC
301 CAGCTGACCC ATGGCATCAA AGAGCTGGGC CCCTACACCC TGGACAGGAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GAGCTCTGTG GCCCCCACCA
401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGACCTCAGG GACTCCATCC
451 TCCCTCCCCA GCCCCACA

(SEQ ID NO: 90)

1 ACAGCTGTTC CTCTCCTGGT GCCGTTCACC CTCAACTTTA CCATCACCAA
51 TCTGCAGTAT GGGGAGGACA TGCGTCACCC TGGCTCCAGG AAGTTCAACA
101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTG GTCCCTTGTT CAAGAACTCC
151 AGTGTCGGCC CTCTGTACTC TGGCTGCAGA CTGATCTCTC TCAGGTCTGA
201 GAAGGATGGG GCAGCCACTG GAGTGGATGC CATCTGCACC CACCACCTTA

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)







251 ACCCTCAAAG CCCTGGACTG GACAGGGAGC AGCTGTACTG GCAGCTGAGC
301 CAGATGACCA ATGGCATCAA AGAGCTGGGC CCCTACACCC
351 CAGTCTCTAC GTCAATGGTT TCACCCATCG [illegible]
401 GCACTCCTTG GACTTCCACA GTTGACCTTG [illegible] [illegible]
451 CCCGTCCCCA GCCCCACA

(SEQ ID NO: 91)

1 ACTGCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA
51 CCTGCAGTAT GAGGAGGACA TGCATCGCCC TGGATCTAGG AAGTTCAACA
101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTA GTCCCATTTT CAAGAACTCC
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCTC TCAGGCCCGA
201 GAAGGATGGG GCAGCAACTG GAATGGATGC TGTCTGCCTC TACCACCCTA
251 ATCCCAAAAG ACCTGGACTG GACAGAGAGC AGCTGTACTG GGAGCTAAGC
301 CAGCTGACCC ACAACATCAC TGAGCTGGGC CCCTACAGCC TGGACAGGGA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GAACTCTGTG CCCACCACCA
401 GTACTCCTGG GACCTCCACA GTGTACTGGG CAACCACTGG GACTCCATCC
451 TCCTTCCCCG GCCACACA

(SEQ ID NO: 92)

1 GAGCCTGGCC CTCTCCTGAT ACCATTCACT TTCAACTTTA CCATCACCAA
51 CCTGCATTAT GAGGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA
101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCTC TCAGGCCCGA

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)







201 GAAGGATGGG GCAGCAACTG GAATGGATGC TGTCTGCCTC TACCACCCTA
251 ATCCCAAAAG ACCTGGGCTG GACAGAGAGC AGCTGTACTG GGAGCTAAGC
301 CAGCTGACCC ACAACATCAC [illegible] [illegible]
351 CAGTCTCTAT GTCAATGGTT TCACCCATCA [illegible]
401 GTACTCCTGG GACCTCCACA GTGTACTGGG [illegible]
451 TCCTTCCCCG GCCACACA

(SEQ ID NO: 93)

1 GAGCCTGGCC CTCTCCTGAT ACCATTCACT TTCAACTTTA CCATCACCAA
51 CCTGCATTAT GAGGAAAACA TGCAACACCC TGGTTCCAGG
101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAACACC
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA
201 GAAGCATGAG GCAGCCACTG GAGTGGACAC CATCTGTACC CACCGCGTTG
251 ATCCCATCGG ACCTGGACTG GACAGGGAGC GGCTATACTG GGAGCTGAGC
301 CAGCTGACCA ACAGCATTAC CGAACTGGGA CCCTACACCC [illegible]
351 CAGTCTCTAT GTCAATGGCT TCAACCCTCG GAGCTCTGTG CCAACCACCA
401 GCACTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC
451 TCCCTGCCTG GCCACACA

(SEQ ID NO: 94)

1 GCCCCTGTCC CTCTCTTGAT ACCATTCACC CTCAACTTTA CCATCACCAA
51 CCTGCATTAT GAGGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA
101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAACACC

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)



151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA
201 GAAGCATGAG GCAGCCACTG GAGTGGACAC CATCTGTACC CACCGCGTTG
251 ATCCCATCGG ACCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC
301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA
401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC
451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 95)

1 TCTGCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA
51 CCTGCAGTAC GAGGAGGACA TGCATCACCC AGGCTCCAGG AAGTTCAACA
101 CCACGGAGCG GGTCCTGCAG GGTCTGCTTG GTCCCATGTT CAAGAACACC
151 AGTGTCGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA
201 GAAGAATGGG GCAGCCACTG GAATGGATGC CATCTGCAGC CACCGTCTTG
251 ACCCCAAAAG CCCTGGACTC GACAGAGAGC AGCTGTACTG GGAGCTGAGC
301 CAGCTGACCC ATGGCATCAA AGAGCTGGGC CCCTACACCC TGGACAGGAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GAGCTCTGTG GCCCCCACCA
401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGACCTCAGG GACTCCATCC
451 TCCCTCCCCA GCCCCACA

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)



(SEQ ID NO: 96)

1 ACAGCTGTTC CTCTCCTGGT GCCGTTCACC CTCAACTTTA CCATCACCAA
51 TCTGCAGTAT GGGGAGGACA TGCGTCACCC TGGCTCCAGG AAGTTCAACA
101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTG GTCCCTTGTT CAAGAACTCC
151 AGTGTCGGCC CTCTGTACTC TGGCTGCAGA CTGATCTCTC TCAGGTCTGA
201 GAAGGATGGG GCAGCCACTG GAGTGGATGC CATCTGCACC CACCACCTTA
251 ACCCTCAAAG CCCTGGACTG GACAGGGAGC AGCTGTACTG GCAGCTGAGC
301 CAGATGACCA [illegible] [illegible] [illegible] TGGACCGGAA
351 CAGTCTCTAC [illegible] [illegible] [illegible] CTCACCACCA
401 GCACTCCTTG GACTTCCACA [illegible] [illegible] GACTCCATCC
451 CCCGTCCCCA GCCCCACA

(SEQ ID NO: 97)

1 ACTGCTGGCC CTCTCCTGGT GCCATTCACC CTAAACTTCA CCATCACCAA
51 CCTGCAGTAT GAGGAGGACA TGCATCGCCC TGGATCTAGG AAGTTCAACG
101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTA GTCCCATATT CAAGAACTCC
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCTC TCAGGCCCGA
201 GAAGGATGGG GCAGCAACTG GAATGGATGC TGTCTGCCTC TACCACCCTA
251 ATCCCAAAAG ACCTGGACTG GACAGAGAGC AGCTGTACTG GGAGCTAAGC
301 CAGCTGACCC ACAACATCAC TGAGCTGGGC CCCTACAGCC TGGACAGGGA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GAGCTCTATG ACGACCACCA

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)







401 GAACTCCTGA TACCTCCACA ATGCACCTGG CAACCTCGAG AACTCCAGCC
451 TCCCTGTCTG GACCTACG

(SEQ ID NO: 98)

1 ACCGCCAGCC [illegible] GCTATTCACA ATCAACTGCA CCATCACCAA
51 CCTGCAGTAC [illegible] TGCGTCGCAC TGGCTCCAGG AAGTTCAACA
101 CCATGGAGAG [illegible] GGTCTGCTCA AGCCCTTGTT CAAGAACACC
151 AGTGTTGGCC [illegible] TGGCTGCAGA TTGACCTTGC TCAGGCCCAA
201 GAAAGATGGG [illegible] GAGTGGATGC CATCTGCACC CACCGCCTTG
251 ACCCCAAAAG CCCTGGACTC AACAGGGAGC AGCTGTACTG GGAGCTAAGC
301 AAACTGACCA ATGACATTGA AGAGCTGGGC CCCTACACCC TGGACAGGAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GAGCTCTGTG TCCACCACCA
401 GCACTCCTGG GACCTCCACA GTGGATCTCA GAACCTCAGG GACTCCATCC
451 TCCCTCTCCA GCCCCACAAT TATG

(SEQ ID NO: 99)

1 NCNNCTGNCC [illegible] NCCNTTCACC NTCAACTTNA CCATCACCAA
51 CCTGCANTAN [illegible] TGCNNCNCCC NGGNTCCAGG AAGTTCAACA
101 CCACNGAGAG GGTCCTACAG GGTCTGCTCA GGCCCTTGTT CAAGAACACC
151 AGTGTCAGCT CTCTGTACTC TGGTTGCAGA CTGACCTTGC TCAGGCCTGA
201 GAAGGATGGG GCAGCCACCA GAGTGGATGC TGCCTGCACC TACCGCCCTG
251 ATCCCAAAAG CCCTGGACTG GACAGAGAGC AACTATACTG GGAGCTGAGC
301 CAGCTAACCC ACAGCATCAC TGAGCTGGGA CCCTACACCC TGGACAGGGT

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)



351 CAGTCTCTAT GTCAATGGCT TCAACCCTCG GAGCTCTGTG CCAACCACCA
401 GCACTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC
451 TCCCTGCCTG GCCACACA

(SEQ ID NO: 100)

1 GCCCCTGTCC CTCTCTTGAT ACCATTCACC CTCAACTTTA CCATCACCAA
51 CCTGCATTAT GAAGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA
101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC
151 AGCGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA
201 GAAACATGGG GCAGCCACTG GAGTGGACGC CATCTGCACC CTCCGCCTTG
251 ATCCCACTGG TCCTGGACTG GACAGAGAGC GGCTATACTG GGAGCTGAGC
301 CAGCTGACCA ACAGCGTTAC AGAGCTGGGC CCCTACACCC TGGACAGGGA
351 CAGTCTCTAT GTCAATGGCT TCACCCAGCG GAGCTCTGTG CCAACCACCA
401 GTATTCCTGG GACCTCTGCA GTGCACCTGG AAACCTCTGG GACTCCAGCC
451 TCCCTGCCTG GCCACACA

(SEQ ID NO: 101)

1 GCCCCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CTATCACCAA
51 CCTGCAGTAT GAGGTGGACA TGCGTCACCC TGGTTCCAGG AAGTTCAACA
101 CCACGGAGAG AGTCCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA
201 AAAACGTGGG GCAGCCACCG GCGTGGACAC CATCTGCACT CACCGCCTTG
251 ACCCTCTAAA CCCTGGACTG GACAGAGAGC AGCTATACTG GGAGCTGAGC

TABLE 15 -continued

CA125 Repeat Nucleotide Sequence
(SEQ ID NO: 83 thru SEQ ID NO: 145)



301 AAACTGACCC GTGGCATCAT CGAGCTGGGC CCCTACCTCC TGGACAGAGG

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GAACTTTGTG CCCATCACCA

401 GCACTCCTGG GACCTCCACA GTACACCTAG GAACCTCTGA AACTCCATCC

451 TCCCTACCTA GACCCATA

(SEQ ID NO: 102)

1 GTGCCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA

51 CTTGCAGTAT GAGGAGGCCA TGCGACACCC TGGCTCCAGG AAGTTCAATA

101 CCACGGAGAG GGTCCTACAG GGTCTGCTCA GGCCCTTGTT CAAGAATACC

151 AGTATCGGCC CTCTGTACTC CAGCTGCAGA CTGACCTTGC TCAGGCCAGA

201 GAAGGACAAG GCAGCCACCA GAGTGGATGC CATCTGTACC CACCACCCTG

251 ACCCTCAAAG CCCTGGACTG AACAGAGAGC AGCTGTACTG GGAGCTGAGC

301 CAGCTGACCC ACGGCATCAC TGAGCTGGGC CCCTACACCC TGGACAGGGA

351 CAGTCTCTAT GTCGATGGTT TCACTCATTG GAGCCCCATA CCGACCACCA

401 GCACTCCTGG GACCTCCATA GTGAACCTGG GAACCTCTGG GATCCCACCT

451 TCCCTCCCTG AAACTACA

(SEQ ID NO: 103)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGAG GGTTCTGCAG GGTCTGCTCA AACCCTTGTT CAGGAATAGC

151 AGTCTGGAAT ACCTCTATTC AGGCTGCAGA CTAGCCTCAC TCAGGCCAGA

201 GAAGGATAGC TCAGCCATGG GAGTGGATGC CATCTGCACA CATCGCCCTG



251 ACCCTGAAGA CCTCGGACTG GACAGAGAGC GACTGTACTG GGAGCTGAGC

301 AATCTGACAA ATGGCATCCA GGAGCTGGGC CCCTACACCC TGGACCGGAA

351 CAGTCTCTAC GTCAATGGTT TCACCCATCG GAGCTCTGGG CTCACCACCA

401 GCACTCCTTG GACTTCCACA GTTGACCTTG GAACCTCAGG GACTCCATCC

451 CCCGTCCCCA GCCCCACA

(SEQ ID NO: 104) 





1 ACTGCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA

51 CCTGCAGTAT GAGGAGGACA TGCATCGCCC TGGTTCCAGG AGGTTCAACA

101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA CGCCCTTGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA

201 GAAGCAAGAG GCAGCCACTG GAGTGGACAC CATCTGTACC CACCGCGTTG

251 ATCCCATCGG ACCTGGACTG GACAGAGAGC GGCTATACTG GGAGCTGAGC

301 CAGCTGACCA ACAGCATCAC AGAGCTGGGA CCCTACACCC TGGATAGGGA

351 CAGTCTCTAT GTCAATGGCT TCAACCCTTG GAGCTCTGTG CCAACCACCA

401 GCACTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC

451 TCCCTGCCTG GCCACACA


(SEQ ID NO: 105) 

1 GCCCCTGTCC CTCTCTTGAT ACCATTCACC CTCAACTTTA CCATCACCGA

51 CCTGCATTAT GAAGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA

101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC

151 AGCGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA







201 GAAACATGGG GCAGCCACTG GAGTGGACGC CATCTGCACC CTCCGCCTTG






ZD ± 




TCCTGGACTG 


GACAGAGAGC 


GGCTATACTG 


GGAGCTGAGC 






301 CAGCTGACCA ACAGCGTTAC AGAGCTGGGC CCCTACACCC TGGACAGGGA

351 CAGTCTCTAT GTCAATGGCT TCACCCATCG GAGCTCTGTG CCAACCACCA

401 GTATTCCTGG GACCTCTGCA GTGCACCTGG AAACCTCTGG GACTCCAGCC

451 TCCCTCCCTG GCCACACA




(SEQ ID NO: 106)

1 GCCCCTGGCC 


CTLlLLlOtal 




PTPAAPTTCA 


CTATCACCAA 


f ^ 




51 


CCTGCAGTAT 


GAGGACjbALA 




Tr^r^TTPPAGG 


AAGTTCAGCA 


25i 




101 


CCACGGAGAG 


AGl LL 1 CaCAvjj 


PPT'PTPPTPZ^ 


AGPPPTTGTT 


CAAGAACACC 


lu 




151 


AGTGTCAGCT 




TPPTTPPZ\(^Zi 


PTGAPPTTGC 


TCAGGCCTGA 


[3 




201 


GAAGGATGGG 




PAQTGGATGC 


TGTCTGCACC 


CATCGTCCTG 






251 


ACCCCAAAAG 




GACAGAGAGC 


GGCTGTACTG 


GAAGCTGAGC 






301 CAGGTGACCC ACGGCATCAC TGAGCTGGGC CCCTACACCC TGGACAGGGA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GAGCTCTATG ACGACCACCA

401 GAACTCCTGA TACCTCCACA ATGCACCTGG CAACCTCGAG AACTCCAGCC

451 TCCCTGTCTG GACCTACG








(SEQ ID NO: 107)

1 ACCGCCAGCC CTCTCCTGGT GCTATTCACA ATTAACTTCA CCATCACTAA

51 CCTGCGGTAT GAGGAGAACA TGCATCACCC TGGCTCTAGA AAGTTTAACA

101 CCACGGAGAG AGTCCTTCAG GGTCTGCTCA GGCCTGTGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCACGC TCAGGCCCAA 

201 GAAGGATGGG GCAGCCACCA AAGTGGATGC CATCTGCACC TACCGCCCTG

251 ATCCCAAAAG CCCTGGACTG GACAGAGAGC AGCTATACTG GGAGCTGAGC 

301 CAGCTAACCC ACAGCATCAC TGAGCTGGGC CCCTACACCC AGGACAGGGA 

351 CAGTCTCTAT GTCAATGGCT TCACCCATCG GAGCTCTGTG CCAACCACCA 

401 GTATTCCTGG GACCTCTGCA GTGCACCTGG AAACCTCTGG GACTCCAGCC 

451 TCCCTCCCTG GCCACACA 
(SEQ ID NO: 108) 



1 GCCCCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CTATCACCAA

51 CCTGCAGTAT GAGGAGGACA TGCGTCACCC TGGTTCCAGG AAGTTCAACA

101 CCACGGAGAG AGTCCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA

201 AAAACGTGGG GCAGCCACCG GCGTGGACAC CATCTGCACT CACCGCCTTG

251 ACCCTCTAAA CCCAGGACTG GACAGAGAGC AGCTATACTG GGAGCTGAGC

301 AAACTGACCC GTGGCATCAT CGAGCTGGGC CCCTACCTCC TGGACAGAGG

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GACCTCTGTG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGGACCTTG GAACCTCAGG GACTCCATTC

451 TCCCTCCCAA GCCCCGCA








(SEQ ID NO: 109)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA



101 CCACNGAGAG GGTCCTGCAG ACTCTGCTTG GTCCTATGTT CAAGAACACC 

151 AGTGTTGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTCCGA

201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTCTTG

251 ACCCCAAAAG CCCTGGAGTG GACAGGGAGC AACTATACTG GGAGCTGAGC

301 CAGCTGACCA ATGGCATTAA AGAACTGGGC CCCTACACCC TGGACAGGAA

351 CAGTCTCTAT GTCAATGGGT TCACCCATTG GATCCCTGTG CCCACCAGCA

401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGTCAGGGAC TCCATCCTCC

451 CTCCCCAGCC CCACA

(SEQ ID NO: 110)

1 ACTGCTGGCC CTCTCCTGGT GCCGTTCACC CTCAACTTCA CCATCACCAA

51 CCTGAAGTAC GAGGAGGACA TGCATTGCCC TGGCTCCAGG AAGTTCAACA

101 CCACAGAGAG AGTCCTGCAG AGTCTGCTTG GTCCCATGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTCCGA

201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTCTTG

251 ACCCCAAAAG CCCTGGAGTG GACAGGGAGC AGCTATACTG GGAGCTGAGC

301 CAGCTGACCA ATGGCATCAA AGAGCTGGGT CCCTACACCC TGGACAGAAA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GACCTCTGCG CCCAACACCA

401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGACCTCAGG GACTCCATCC

451 TCCCTCCCCA GCCCTACA



(SEQ ID NO: 111) 

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA

201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN

251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATTG GATCCCTGTG CCCACCAGCA

401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGTCAGGGAC TCCATCCTCC

451 CTCCCCAGCC CCACA

(SEQ ID NO: 112)

1 ACTGCTGGCC CTCTCCTGGT GCCGTTCACC CTCAACTTCA CCATCACCAA

51 CCTGAAGTAC GAGGAGGACA TGCATTGCCC TGGCTCCAGG AAGTTCAACA

101 CCACAGAGAG AGTCCTGCAG AGTCTGCTTG GTCCCATGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCGC TCAGGTCCGA

201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTGTTG

251 ACCCCAAAAG CCCTGGAGTG GACAGGGAGC AGCTATACTG GGAGCTGAGC

301 CAGCTGACCA ATGGCATCAA AGAGCTGGGT CCCTACACCC TGGACAGAAA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GACCTCTGCG CCCAACACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC



451 TCCNTCCCCN GCCNCACA 

(SEQ ID NO: 113) 

1 TCTGCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA

51 CCTGCAGTAC GAGGAGGACA TGCATCACCC AGGCTCCAGG AAGTTCAACA

101 CCACGGAGCG GGTCCTGCAG GGTCTGCTTG GTCCCATGTT CAAGAACACC

151 AGTGTCGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA

201 GAAGAATGGG GCAACCACTG GAATGGATGC CATCTGCACC CACCGTCTTG

251 ACCCCAAAAG CCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 114)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGAG GGTTCTGCAG GGTCTGCTCA AACCCTTGTT CAGGAATAGC

151 AGTCTGGAAT ACCTCTATTC AGGCTGCAGA CTAGCCTCAC TCAGGCCAGA

201 GAAGGATAGC TCAGCCATGG CAGTGGATGC CATCTGCACA CATCGCCCTG

251 ACCCTGAAGA CCTCGGACTG GACAGAGAGC GACTGTACTG GGAGCTGAGC

301 AATCTGACAA ATGGCATCCA GGAGCTGGGC CCCTACACCC TGGACCGGAA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG AAGCTCTATG CCCACCACCA






401 GCACTCCTGG GACCTCCACA GTGGATGTGG GAACCTCAGG GACTCCATCC 

451 TCCAGCCCCA GCCCCACG

(SEQ ID NO: 115)

1 ACTGCTGGCC CTCTCCTGAT ACCATTCACC CTCAACTTCA CCATCACCAA

51 CCTGCAGTAT GGGGAGGACA TGGGTCACCC TGGCTCCAGG AAGTTCAACA

101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTG GTCCCATATT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCTC TCAGGTCTGA

201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCATC CATCATCTTG

251 ACCCCAAAAG CCCTGGACTC AACAGAGAGC GGCTGTACTG GGAGCTGAGC

301 CAACTGACCA ATGGCATCAA AGAGCTGGGC CCCTACACCC TGGACAGGAA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GACCTCTGTG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGGACCTTG GAACCTCAGG GACTCCATTC

451 TCCCTCCCAA GCCCCGCA

(SEQ ID NO: 116)

1 ACTGCTGGCC CTCTCCTGGT GCTGTTCACC CTCAACTTCA CCATCACCAA

51 CCTGAAGTAT GAGGAGGACA TGCATCGCCC TGGCTCCAGG AAGTTCAACA

101 CCACTGAGAG GGTCCTGCAG ACTCTGCTTG GTCCTATGTT CAAGAACACC

151 AGTGTTGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTCCGA

201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTCTTG

251 ACCCCAAAAG CCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 117)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGAG AGTCCTTCAG GGTCTGCTCA GGCCTGTGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCCAA

201 GAAGGATGGG GCAGCCACCA AAGTGGATGC CATCTGCACC TACCGCCCTG

251 ATCCCAAAAG CCCTGGACTG GACAGAGAGC AGCTATACTG GGAGCTGAGC

301 CAGCTAACCC ACAGCATCAC TGAGCTGGGC CCCTACACCC AGGACAGGGA

351 CAGTCTCTAT GTCAATGGCT TCACCCATCG GAGCTCTGTG CCAACCACCA

401 GTATTCCTGG GACCTCTGCA GTGCACCTGG AAACCACTGG GACTCCATCC

451 TCCTTCCCCG GCCACACA

(SEQ ID NO: 118)

1 GAGCCTGGCC CTCTCCTGAT ACCATTCACT TTCAACTTTA CCATCACCAA

51 CCTGCGTTAT GAGGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA

101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA CGCCCTTGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA

201 GAAGCAGGAG GCAGCCACTG GAGTGGACAC CATCTGTACC CACCGCGTTG

251 ATCCCATCGG ACCTGGACTG GACAGAGAGC GGCTATACTG GGAGCTGAGC

301 CAGCTGACCA ACAGCATCAC AGAGCTGGGA CCCTACACCC TGGATAGGGA

351 CAGTCTCTAT GTCGATGGCT TCAACCCTTG GAGCTCTGTG CCAACCACCA

401 GCACTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC

451 CCCCTGCCTG GCCACACA

(SEQ ID NO: 119) 



1 GCCCCTGTCC CTCTCTTGAT ACCATTCACC CTCAACTTTA CCATCACCGA

51 CCTGCATTAT GAAGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA

101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC

151 AGCGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA

201 GAAACATGGG GCAGCCACTG GAGTGGACGC CATCTGCACC CTCCGCCTTG

251 ATCCCACTGG TCCTGGACTG GACAGAGAGC GGCTATACTG GGAGCTGAGC

301 CAGCTGACCA ACAGCATCAC AGAGCTGGGA CCCTACACCC TGGATAGGGA

351 CAGTCTCTAT GTCAATGGCT TCAACCCTTG GAGCTCTGTG CCAACCACCA

401 GCACTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC

451 TCCCTGCCTG GCCACACA








(SEQ ID NO: 120)

1 ACTGCTGGCC CTCTCCTGGT GCCGTTCACC CTCAACTTCA CCATCACCAA

51 CCTGAAGTAC GAGGAGGACA TGCATTGCCC TGGCTCCAGG AAGTTCAACA

101 CCACAGAGAG AGTCCTGCAG AGTCTGCATG GTCCCATGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGTCCGA








201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCACC CACCGTCTTG


251 


AC C C C AAAACj 


/-I /-I pi rp /-I p TV p rn 




NGCTNTACTG 


GGAGCTNAGC 




301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA


(SEQ ID NO: 121) 

1 NCMNCTGNCC 


CTCTCCTGNT 


NCCNTTCACC 


JM 1 L,AAU i liMA. 


PPATrAfTAA 


Ml 


51 


CCTGCANTAN 


GNGGANNACA 


TGCNNCNCCC 




Z\ A r"TTr* A A P A 


25i 


101 


CCACNGAGNG 


NGTNCTGCAG 


GGTCTGCTNN 


NjnCCCJn iJNI 1 1 


HTV TiP A Z\ PMPP 


"rj 


151 


AGTGTNGGCC 


NTCTGTACTC 


TGGCTGCAGA 




TH a P fi'MP'MP A 




201 


GAAGNATGGN 


GCAGCCACTG 


CjAN 1 oUA 1 (crL. 


r* a T p T n A "KT P 


PAPCNNCNTN 


M 

7,. i 


251 


ANCCCAAAAG 


NCC ICjCjAL. Ivj 






GGAGCTNAGC 




301 CANCTGACCA ACAGCATCAC AGAGCTGGGA CCCTACACCC TGGATAGGGA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG AAGCTCTATG CCCACCACCA

401 GTATTCCTGG GACCTCTGCA GTGCACCTGG AAACCTCTGG GACTCCAGCC

451 TCCCTCCCTG GCCACACA








(SEQ ID NO: 122) 

1 GCCCCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CTATCACCAA

51 CCTGCAGTAT GAGGAGGACA TGCGTCACCC TGGTTCCAGG AAGTTCAACA

101 CCACGGAGAG AGTCCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC





151 


AGTGTTGGCC 


LiLibrlAL-iU 




PTGACCTTGC 


TCAGGCCTGA 


10 


2 01 






GCGTGGACAC 


CATCTGCACT 


CACCGCCTTG 








rrrTHHACTG 

\^ \w J. v_j wr^v-* X \_J 


NACAGNGAGC 


NGCTNTACTG 


GGAGCTNAGC 


301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA


(SEQ ID NO: 123) 

1 NCNNCTGNCC 


CTCTCCTGNT 


■\T/^/^TvTT"T'P A PP 

IMLLJNl i i 


■KIT P Zi Zli P T TKT A 


CCATCACCAA 




51 


CCTGCANTAN 


GNGGANNACA 




"MP pisjT p p A 


A AHTTCAACA 




101 


CCACNGAGNG 


NGTNCTGCAG 


T" P T" P P 'y'NTNT 


"NTM P P PNTTNT T 

XNiNVwV^V^iN XIN X X 


CAAGAACNCC 




151 


AGTGTNGGCC 


NTCTGTACTC 


rpOi^ r^TT^ A A 

i CjCjL. 1 (jL-AvjA 


PTPA PPTKTKTP 


TPAGGNCNGA 


3i 


201 


GAAGNATGGN 


(^ri A PPP A PTP 


P KTT(^(^ Z\ T(^P 


CATCTGCANC 


CACCNNCNTN 




251 


ANCCCAAAAG 


"NTPPTPPZi PTn 




NGCTNTACTG 


GGAGCTNAGC 


301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TTCACCCTCG GAGCTCTGTG CCAACCACCA

401 GCACTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC

451 TCCCTGCCTG GCCACACA


(SEQ ID NO: 124) 

1 GCCCCTGTCC CTCTCTTGAT ACCATTCACC CTCAACTTTA CCATCACCAA

51 CCTGCATTAT GAAGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA





101 CCACGGAGCG GGTCCTGCAG GGTCTGCTTG GTCCCATGTT CAAGAACACA

151 AGTGTCGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA




201 


GAAGAATGGG 


GCAGCCACTG 


GAATGGATGC 


CATCTGCACjC 


LAOCLr 1 itj 


15 


251 


ACCCCAAAAG 


CCCTGGACTG 


NACAGNGAGC 


NGCTNTAL i Cj 




301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA


(SEQ ID NO: 125) 

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA

201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN

251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GAACTCTGTG CCCACCACCA

401 GTACTCCTGG GACCTCCACA GTGTACTGGG CAACCACTGG GACTCCATCC

451 TCCTTCCCCG GCCACACA



(SEQ ID NO: 126) 

1 GAGCCTGGCC CTCTCCTGAT ACCATTCACT TTCAACTTTA CCATCACCAA

51 CCTGCATTAT GAGGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA

101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA CGCCCTTGTT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA

201 GAAGCAGGAG GCAGCCACTG GAGTGGACAC CATCTGTACC CACCGCGTTG

251 ATCCCATCGG ACCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 127)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA

201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN

251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GAGCTCTGTG CCAACCACCA

401 GCAGTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC



451 TCCCTGCCTG GCCACACA 

(SEQ ID NO: 128)

1 GCCCCTGTCC CTCTCTTGAT ACCATTCACC CTCAACTTTA CCATCACCAA

51 CCTGCATTAT GAAGAAAACA TGCAACACCC TGGTTCCAGG AAGTTCAACA

101 CCACGGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGACCTGA

201 GAAACATGGG GCAGCCACTG GAGTGGACGC CATCTGCACC CTCCGCCTTG

251 ATCCCACTGG TCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 129)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA

201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN

251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GACCTCTGTG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGCACCTGG CAACCTCTGG GACTCCATCC

451 TCCCTGCCTG GCCACACA

(SEQ ID NO: 130)

1 GCCCCTGTCC CTCTCTTGAT ACCATTCACC CTCAACTTTA CCATCACCAA

51 CCTGCAGTAT GAGGAGGACA TGCATCGCCC TGGATCTAGG AAGTTCAACA

101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTA GTCCCATTTT CAAGAACTCC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCTC TCAGGCCCGA

201 GAAGGATGGG GCAGCAACTG GAATGGATGC TGTCTGCCTC TACCACCCTA

251 ATCCCAAAAG ACCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 131)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA

201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN

251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA





351 CAGTCTCTAT GTCAATGGTT TCACCCATTG GAGCTCTGGG CTCACCACCA

401 GCACTCCTTG GACTTCCACA GTTGACCTTG GAACCTCAGG GACTCCATCC

451 CCCGTCCCCA GCCCCACA


(SEQ ID NO: 132) 

1 ACTGCTGGCC 


CTCTCCTGGT 




PTA A A PT'T'P A 
^ 1 /\/\AL, 1 1 U A 


L LA 1 LAC CAA 




51 


CCTGCAGTAT 


GAGGAGGACA 




1 o^A 1 U 1 Atjtsj 


AACj ttcaacg 


20 


101 


CCACAGAGAG 


GGTCCTGCAfJ 




PT'P/^/^ 7\ m^y rnrp 
VJ1L.L.CA1A1 1 


caagaacacc 




151 


AGTGTTGGCC 






ILrALLl ICjC 


tcagacctga 




201 


GAAGCAGGAG 






LATCTGTACC 


caccgcgttg 




251 ATCCCATCGG ACCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA


(SEQ ID NO: 133) 

1 NCNNCTGNCC 






JM 1 LAACTTNA 


C CATC AC CAA 




51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA

201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN

251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC




301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GAGCTTTGGG CTCACCACCA

401 GCACTCCTTG GACTTCCACA GTTGACCTTG GAACCTCAGG GACTCCATCC

451 CCCGTCCCCA GCCCCACA

(SEQ ID NO: 134)

1 ACTGCTGGCC CTCTCCTGGT GCCATTCACC CTAAACTTCA CCATCACCAA

51 CCTGCAGTAT GAGGAGGACA TGCATCGCCC TGGCTCCAGG AAGTTCAACA

101 CCACGGAGAG GGTCCTTCAG GGTCTGCTTA CGCCCTTGTT CAGGAACACC

151 AGTGTCAGCT CTCTGTACTC TGGTTGCAGA CTGACCTTGC TCAGGCCTGA

201 GAAGGATGGG GCAGCCACCA GAGTGGATGC TGTCTGCACC CATCGTCCTG

251 ACCCCAAAAG CCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 135)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA

201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN



251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATTG GATCCCTGTG CCCACCAGCA

401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGTCAGGGAC TCCATCCTCC

451 CTCCCCAGCC CCACA

(SEQ ID NO: 136)

1 ACTGCTGGCC CTCTCCTGGT ACCATTCACC CTCAACTTCA CCATCACCAA

51 CCTGCAGTAT GGGGAGGACA TGGGTCACCC TGGCTCCAGG AAGTTCAACA

101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTG GTCCCATATT CAAGAACACC

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTCTC TCAGGTCCGA

201 GAAGGATGGA GCAGCCACTG GAGTGGATGC CATCTGCATC CATCATCTTG

251 ACCCCAAAAG CCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 137)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA

51 CCTGGANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA

101 CCACNGAGNG NGTNCTGCAG GGTCTGCTNN NNCCCNTNTT CAAGAACNCC

151 AGTGTNGGCC NTCTGTACTC TGGCTGCAGA CTGACCTNNC TCAGGNCNGA




201 GAAGNATGGN GCAGCCACTG GANTGGATGC CATCTGCANC CACCNNCNTN 

251 ANCCCAAAAG NCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC 

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GACCTTTGCG CCCAACACCA

401 GCACTCCTGG GACCTCCACA GTGGACCTTG GGACCTCAGG GACTCCATCC
451 TCCCTCCCC AGCCCTACA 

(SEQ ID NO: 138) 

1 TCTGCTGGCC CTCTCCTGGT GCCATTCACC CTCAACTTCA CCATCACCAA 

51 CCTGCAGTAC GAGGAGGACA TGCATCACCC AGGCTCCAGG AAGTTCAACA 

101 CCACGGAGCG GGTCCTGCAG GGTCTGCTTG GTCCCATGTT CAAGAACACC 

151 AGTGTCGGCC TTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA 

201 GAAGAATGGG GCAGCCACCA GAGTGGATGC TGTCTGCACC CATCGTCCTG 

251 ACCCCAAAAG CCCTGGACTG NACAGNGAGC NGCTNTACTG GGAGCTNAGC 

301 CANCTGACCA ANNNCATCNN NGAGCTGGGN CCCTACACCC TGGACAGGNA

351 CAGTCTCTAT GTCAATGGTT TCACCCATCN GANCTCTGNG CCCACCACCA

401 GCACTCCTGG GACCTCCACA GTGNACNTNG GNACCTCNGG GACTCCATCC

451 TCCNTCCCCN GCCNCACA

(SEQ ID NO: 139)

1 NCNNCTGNCC CTCTCCTGNT NCCNTTCACC NTCAACTTNA CCATCACCAA 

51 CCTGCANTAN GNGGANNACA TGCNNCNCCC NGGNTCCAGG AAGTTCAACA 
101 CCACNGAGAG GGTTCTGCAG GGTCTGCTCA AGCCCTTGTT CAAGAGCACC 

151 AGTGTTGGCC CTCTGTATTC TGGCTGCAGA CTGACCTTGC TCAGGCCTGA 

201 GAAGGACGGA GTAGCCACCA GAGTGGACGC CATCTGCACC CACCGCCCTG 

251 ACCCCAAAAT CCCTGGGCTA GACAGACAGC AGCTATACTG GGAGCTGAGC 

301 CAGCTGACCC ACAGCATCAC TGAGCTGGGA CCCTACACCC TGGATAGGGA 

351 CAGTCTCTAT GTCAATGGTT TCACCCAGCG GAGCTCTGTG CCCACCACCA 

401 GCACTCCTGG GACTTTCACA GTACAGCCGG AAACCTCTGA GACTCCATCA 

451 TCCCTCCCTG GCCCCACA 

(SEQ ID NO: 140) 

1 GCCACTGGCC CTGTCCTGCT GCCATTCACC CTCAATTTTA CCATCACTAA 

51 CCTGCAGTAT GAGGAGGACA TGCATCGCCC TGGCTCCAGG AAGTTCAACA 

101 CCACGGAGAG GGTCCTTCAG GGTCTGCTTA TGCCCTTGTT CAAGAACACC 

151 AGTGTCAGCT CTCTGTACTC TGGTTGCAGA CTGACCTTGC TCAGGCCTGA 

201 GAAGGATGGG GCAGCCACCA GAGTGGATGC TGTCTGCACC CATCGTCCTG

251 ACCCCAAAAG CCCTGGACTG GACAGAGAGC GGCTGTACTG GAAGCTGAGC 

301 CAGCTGACCC ACGGCATCAC TGAGCTGGGC CCCTACACCC TGGACAGGCA 

351 CAGTCTCTAT GTCAATGGTT TCACCCATCA GAGCTCTATG ACGACCACCA 

401 GAACTCCTGA TACCTCCACA ATGCACCTGG CAACCTCGAG AACTCCAGCC 

451 TCCCTGTCTG GACCTACG 

(SEQ ID NO: 141) 

1 ACCGCCAGCC CTCTCCTGGT GCTATTCACA ATTAACTTCA CCATCACTAA 

51 CCTGCGGTAT GAGGAGAACA TGCATCACCC TGGCTCTAGA AAGTTTAACA 

101 CCACGGAGAG AGTCCTTCAG GGTCTGCTCA GGCCTGTGTT CAAGAACACC 

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACCTTGC TCAGGCCCAA 

201 GAAGGATGGG GCAGCCACCA AAGTGGATGC CATCTGCACC TACCGCCCTG

251 ATCCCAAAAG CCCTGGACTG GACAGAGAGC AGCTATACTG GGAGCTGAGC 

301 CAGCTAACCC ACAGCATCAC TGAGCTGGGC CCCTACACCC TGGACAGGGA 

351 CAGTCTCTAT GTCAATGGTT TCACACAGCG GAGCTCTGTG CCCACCACTA 

401 GCATTCCTGG GACCCCCACA GTGGACCTGG GAACATCTGG GACTCCAGTT 

451 TCTAAACCTG GTCCCTCG 

(SEQ ID NO: 142) 

1 GCTGCCAGCC CTCTCCTGGT GCTATTCACT CTCAACTTCA CCATCACCAA 

51 CCTGCGGTAT GAGGAGAACA TGCAGCACCC TGGCTCCAGG AAGTTCAACA 

101 CCACGGAGAG GGTCCTTCAG GGCCTGCTCA GGTCCCTGTT CAAGAGCACC 

151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA CTGACTTTGC TCAGGCCTGA 

201 AAAGGATGGG ACAGCCACTG GAGTGGATGC CATCTGCACC CACCACCCTG

251 ACCCCAAAAG CCCTAGGCTG GACAGAGAGC AGCTGTATTG GGAGCTGAGC

301 CAGCTGACCC ACAATATCAC TGAGCTGGGC CACTATGCCC TGGACAACGA

351 CAGCCTCTTT GTCAATGGTT TCACTCATCG GAGCTCTGTG TCCACCACCA

401 GCACTCCTGG GACCCCCACA GTGTATCTGG GAGCATCTAA GACTCCAGCC

451 TCGATATTTG GCCCTTCA 



(SEQ ID NO: 143) 

1 GCTGCCAGCC ATCTCCTGAT ACTATTCACC CTCAACTTCA CCATCACTAA 



51 CCTGCGGTAT GAGGAGAACA TGTGGCCTGG CTCCAGGAAG TTCAACACTA 

101 CAGAGAGGGT CCTTCAGGGC CTGCTAAGGC CCTTGTTCAA GAACACCAGT 

151 GTTGGCCCTC TGTACTCTGG CTCCAGGCTG ACCTTGCTCA GGCCAGAGAA

201 AGATGGGGAA GCCACCGGAG TGGATGCCAT CTGCACCCAC CGCCCTGACC

251 CCACAGGCCC TGGGCTGGAC AGAGAGCAGC TGTATTTGGA GCTGAGCCAG

301 CTGACCCACA GCATCACTGA GCTGGGCCCC TACACACTGG ACAGGGACAG

351 TCTCTATGTC AATGGTTTCA CCCATCGGAG CTCTGTACCC ACCACCAGC

(SEQ ID NO: 144)

1 ACCGGGGTGG TCAGCGAGGA GCCATTCACA CTGAACTTCA CCATCAACAA

51 CCTGCGCTAC ATGGCGGACA TGGGCCAACC CGGCTCCCTC AAGTTCAACA

101 TCACAGACAA CGTCATGAAG CACCTGCTCA GTCCTTTGTT CCAGAGGAGC

151 AGCCTGGGTG CACGGTACAC AGGCTGCAGG GTCATCGCAC TAAGGTCTGT

201 GAAGAACGGT GCTGAGACAC GGGTGGACCT CCTCTGCACC TACCTGCAGG

251 CCCTCAGCGG CCCAGGTCTG CCTATCAAGC AGGTGTTCCA TGAGCTGAGC

301 CAGCAGACCC ATGGCATCAC CCGGCTGGGC CCCTACTCTC TGGACAAAGA

351 CAGCCTCTAC CTTAACGGTT ACAATGAACC TGGTCTAGAT GAGCCTCCTA

401 CAACTCCCAA GCCAGCCACC ACATTCCTGC CTCCTCTGTC AGAAGCCACA

451 ACA



(SEQ ID NO: 145) 



1 GCCATGGGGT ACCACCTGAA GACCCTCACA CTCAACTTCA CCATCTCCAA

51 TCTCCAGTAT TCACCAGATA TGGGCAAGGG CTCAGCTACA TTCAACTCCA

101 CCGAGGGGGT CCTTCAGCAC CTGCTCAGAC CCTTGTTCCA GAAGAGCAGC

151 ATGGGCCCCT TCTACTTGGG TTGCCAACTG ATCTCCCTCA GGCCTGAGAA

201 GGATGGGGCA GCCACTGGTG TGGACACCAC CTGCACCTAC CACCCTGACC

251 CTGTGGGCCC CGGGCTGGAC ATACAGCAGC TTTACTGGGA GCTGAGTCAG

301 CTGACCCATG GTGTCACCCA ACTGGGCTTC TATGTCCTGG ACAGGGATAG

351 CCTCTTCATC AATGGCTATG CACCCCAGAA TTTATCAATC CGGGGCGAGT

401 ACCAGATAAA TTTCCACATT GTCAACTGGA ACCTCAGTAA TCCAGACCCC

451 ACATCCTCAG AGTAC
TABLE 17

Carboxy Terminal Nucleotide Sequence
(SEQ ID NO: 147)



1 GCCATGGGGT ACCACCTGAA GACCCTCACA CTCAACTTCA CCATCTCCAA 

51 TCTCCAGTAT TCACCAGATA TGGGCAAGGG CTCAGCTACA TTCAACTCCA

101 CCGAGGGGGT CCTTCAGCAC CTGCTCAGAC CCTTGTTCCA GAAGAGCAGC 

151 ATGGGCCCCT TCTACTTGGG TTGCCAACTG ATCTCCCTCA GGCCTGAGAA 

201 GGATGGGGCA GCCACTGGTG TGGACACCAC CTGCACCTAC CACCCTGACC 

251 CTGTGGGCCC CGGGCTGGAC ATACAGCAGC TTTACTGGGA GCTGAGTCAG 

3 01 CTGACCCATG GTGTCACCCA ACTGGGCTTC TATGTCCTGG ACAGGGATAG 

351 CCTCTTCATC AATGGCTATG CACCCCAGAA TTTATCAATC CGGGGCGAGT 

401 ACCAGATAAA TTTCCACATT GTCAACTGGA ACCTCAGTAA TCCAGACCCC 

451 ACATCCTCAG AGTACATCAC CCTGCTGAGG GACATCCAGG ACAAGGTCAC 

501 CACACTCTAC AAAGGCAGTC AACTACATGA CACATTCCGC TTCTGCCTGG 

551 TCACCAACTT GACGATGGAC TCCGTGTTGG TCACTGTCAA GGCATTGTTC 

6 01 TCCTCCAATT TGGACCCCAG CCTGGTGGAG CAAGTCTTTC TAGATAAGAC 

651 CCTGAATGCC TCATTCCATT GGCTGGGCTC CACCTACCAG TTGGTGGACA 

701 TCCATGTGAC AGAAATGGAG TCATCAGTTT ATCAACCAAC AAGCAGCTCC 

751 AGCACCCAGC ACTTCTACCT GAATTTCACC ATCACCAACC TACCATATTC 

40 801 CCAGGACAAA GCCCAGCCAG GCACCACCAA TTACCAGAGG AACAAAAGGA 

851 ATATTGAGGA TGCGCTCAAC CAACTCTTCC GAAACAGCAG CATCAAGAGT 

901 TATTTTTCTG ACTGTCAAGT TTCAACATTC AGGTCTGTCC CCAACAGGCA 
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TABLE 17 -continued 



Carboxy Terminal Nucleotide Sequence 
(SEQ ID NO: 147) 

951 CCACACCGGG GTGGACTCCC TGTGTAACTT CTCGCCACTG GCTCGGAGAG 
1001 TAGACAGAGT TGCCATCTAT GAGGAATTTC TGCGGATGAC CCGGAATGGT 
1051 ACCCAGCTGC AGAACTTCAC CCTGGACAGG AGCAGTGTCC TTGTGGATGG 
1101 GTATTCTCCC AACAGAAATG AGCCCTTAAC TGGGAATTCT GACCTTCCCT 
1151 TCTGGGCTGT CATCCTCATC GGCTTGGCAG GACTCCTGGG ACTCATCACA 
1201 TGCCTGATCT GCGGTGTCCT GGTGACCACC CGCCGGCGGA AGAAGGAAGG 
1251 AGAATACAAC GTCCAGCAAC AGTGCCCAGG CTACTACCAG TCACACCTAG 
1301 ACCTGGAGGA TCTGCAATGA CTGGAACTTG CCGGTGCCTG GGGTGCCTTT 
1351 CCCCCAGCCA GGGTCCAAAG AAGCTTGGCT GGGGCAGAAA TAAACCATAT 
1401 TGGTCGGAAA AAAAAAAAAA AA 
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TABLE 18 



Carboxy Terminal Amino Acid Sequence
(SEQ ID NO: 148)

1 AMGYHLKTLT LNFTISNLQY SPDMGKGSAT FNSTEGVLQH LLRPLFQKSS
51 MGPFYLGCQL ISLRPEKDGA ATGVDTTCTY HPDPVGPGLD IQQLYWELSQ
101 LTHGVTQLGF YVLDRDSLFI NGYAPQNLSI RGEYQINFHI VNWNLSNPDP
151 TSSEYITLLR DIQDKVTTLY KGSQLHDTFR FCLVTNLTMD SVLVTVKALF
201 SSNLDPSLVE QVFLDKTLNA SFHWLGSTYQ LVDIHVTEME SSVYQPTSSS
251 STQHFYLNFT ITNLPYSQDK AQPGTTNYQR NKRNIEDALN QLFRNSSIKS
301 YFSDCQVSTF RSVPNRHHTG VDSLCNFSPL ARRVDRVAIY EEFLRMTRNG
351 TQLQNFTLDR SSVLVDGYSP NRNEPLTGNS DLPFWAVILI GLAGLLGLIT
401 CLICGVLVTT RRRKKEGEYN VQQQCPGYYQ SHLDLEDLQ
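The correspondence between the nucleotide listing of Table 17 and the amino acid listing of Table 18 can be spot-checked by translating the first 60 nucleotides of SEQ ID NO: 147 and comparing the result with the first 20 residues of SEQ ID NO: 148. The following is a minimal Python sketch, assuming the standard genetic code and a reading frame that begins at nucleotide 1. The names `CODON_TABLE` and `translate` are illustrative only and do not appear in the original disclosure.

```python
from itertools import product

# Standard genetic code laid out in TCAG order (assumed; illustrative helper).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-60 of SEQ ID NO: 147, copied from Table 17.
FIRST_60_NT = (
    "GCCATGGGGT ACCACCTGAA GACCCTCACA CTCAACTTCA CCATCTCCAA "
    "TCTCCAGTAT"
)

if __name__ == "__main__":
    print(translate(FIRST_60_NT))
```

The same function applied to the last four codons of the open reading frame (`GATCTGCAATGA`, nucleotides 1309-1320 of Table 17) ends in a stop codon, consistent with the final residues of Table 18.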





TABLE 19A



Serine/Threonine O-glycosylation Pattern Predicted for the
Amino Terminal End of the CA125 Molecule
(SEQ ID NO: 149)

SEQ ID NO: 149 Length: 1799

RTDGIMEHITKIPNEAAHRGTIRPVKGPQTSTSPASPKGLHTGGTKRMETTTTALKTTTTALKTTSRATLTTSVYTPTLG 80
TLTPLNASRQMASTILTEMMITTPYVFPDVPETTSSLATSLGAETSTALPRTTPSVLNRESETTASLVSRSGAERSPVIQ 160
TLDVSSSEPDTTASWVIHPAETIPTVSKTTPNFFHSELDTVSSTATSHGADVSSAIPTNISPSELDALTPLVTISGTDTS 240
TTFPTLTKSPHETETRTTWLTHPAETSSTIPRTIPNFSHHESDATPSIATSPGAETSSAIPIMTVSPGAEDLVTSQVTSS 320
GTDRNMTIPTLTLSPGEPKTIASLVTHPEAQTSSAIPTSTISPAVSRLVTSMVTSLAAKTSTTNRALTNSPGEPATTVSL 400
VTHPAQTSPTVPWTTSIFFHSKSDTTPSMTTSHGAESSSAVPTPTVSTEVPGVVTPLVTSSRAVISTTIPILTLSPGEPE 480
TTPSMATSHGEEASSAIPTPTVSPGVPGVVTSLVTSSRAVTSTTIPILTFSLGEPETTPSMATSHGTEAGSAVPTVLPEV 560
PGMVTSLVASSRAVTSTTLPTLTLSPGEPETTPSMATSHGAEASSTVPTVSPEVPGVVTSLVTSSSGVNSTSIPTLILSP 640
GELETTPSMATSHGAEASSAVPTPTVSPGVSGVVTPLVTSSRAVTSTTIPILTLSSSEPETTPSMATSHGVEASSAVLTV 720
SPEVPGMVTSLVTSSRAVTSTTIPTLTISSDEPETTTSLVTHSEAKMISAIPTLAVSPTVQGLVTSLVTSSGSETSAFSN 800
LTVASSQPETIDSWVAHPGTEASSVVPTLTVSTGEPFTNISLVTHPAESSSTLPRTTSRFSHSELDTMPSTVTSPEAESS 880
SAISTTISPGIPGVLTSLVTSSGRDISATFPTVPESPHESEATASWVTHPAVTSTTVPRTTPNYSHSEPDTTPSIATSPG 960
AEATSDFPTITVSPDVPDMVTSQVTSSGTDTSITIPTLTLSSGEPETTTSFITYSETHTSSAIPTLPVSPGASKMLTSLV 1040
ISSGTDSTTTFPTLTETPYEPETTAIQLIHPAETNTMVPRTTPKFSHSKSDTTLPVAITSPGPEASSAVSTTTISPDMSD 1120
LVTSLVPSSGTDTSTTFPTLSETPYEPETTATWLTHPAETSTTVSGTIPNFSHRGSDTAPSMVTSPGVDTRSGVPTTTIP 1200
PSIPGVVTSQVTSSATDTSTAIPTLTPSPGEPETTASSATHPGTQTGFTVPIRTVPSSEPDTMASWVTHPPQTSTPVSRT 1280
TSSFSHSSPDATPVMATSPRTEASSAVLTTISPGAPEMVTSQITSSGAATSTTVPTLTHSPGMPETTALLSTHPRTETSK 1360
TFPASTVFPQVSETTASLTIRPGAETSTALPTQTTSSLFTLLVTGTSRVDLSPTASPGVSAKTAPLSTHPGTETSTMIPT 1440
STLSLGLLETTGLLATSSSAETSTSTLTLTVSPAVSGLSSASITTDKPQTVTSWNTETSPSVTSVGPPEFSRTVTGTTMT 1520
LIPSEMPTPPKTSHGEGVSPTTILRTTMVEATNLATTGSSPTVAKTTTTFNTLAGSLFTPLTTPGMSTLASESVTSRTSY 1600
NHRSWISTTSSYNRRYWTPATSTPVTSTFSPGISTSSIPSSTAATVPFMVPFTLNFTITNLQYEEDMRHPGSRKFNATER 1680
ELQGLLKPLFRNSSLEYLYSGCRLASLRPEKDSSAMAVDAICTHRPDPEDLGLDRERLYWELSNLTNGIQELGPYTLDRN 1760
SLYVNGFTHRSSMPTTSTPGTSTVDVGTSGTPSSSPSPT
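Because O-glycosylation occurs on serine and threonine residues, a simple tally of S and T in a stretch of SEQ ID NO: 149 gives the upper bound on candidate acceptor sites in that stretch. The sketch below is illustrative only: it counts residues and does not reproduce the prediction underlying Table 19, and the function name `count_ser_thr` is an assumption introduced here.

```python
def count_ser_thr(sequence: str) -> dict:
    """Count serine (S) and threonine (T) residues in a protein string."""
    seq = sequence.replace(" ", "").upper()
    return {"S": seq.count("S"), "T": seq.count("T"), "length": len(seq)}


# Residues 1-80 of SEQ ID NO: 149, copied from the first row of Table 19A.
FIRST_ROW = (
    "RTDGIMEHITKIPNEAAHRGTIRPVKGPQTSTSPASPKGL"
    "HTGGTKRMETTTTALKTTTTALKTTSRATLTTSVYTPTLG"
)

if __name__ == "__main__":
    counts = count_ser_thr(FIRST_ROW)
    candidates = counts["S"] + counts["T"]
    print(counts, f"candidate sites: {candidates} of {counts['length']}")
```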

TABLE 19B

Serine/Threonine O-glycosylation Pattern Predicted for the
Amino Terminal End of the CA125 Molecule

[Residue-by-residue map aligned to SEQ ID NO: 149, in rows ending at positions 80 through 1760, with serine (S) and threonine (T) marked at predicted sites and dots elsewhere; the column alignment is not recoverable from this copy.]

TABLE 20



Nucleotide and Amino Acid Sequences of Recombinant CA125 Repeat Showing Peptides
(Underlined 1-4) which are Antigenically Matched for Immune Stimulation of
Patients with the HLA-2 Histocompatibility Subtype

CA 125 Recombinant Nucleotide and Amino Acid Sequences
(SEQ ID NO: 151 and SEQ ID NO: 152, respectively)
CA 125 Recombinant Nucleotide (Anti-Sense Strand) Sequence (SEQ ID NO: 153)
Peptide 1 (SEQ ID NO: 154); Peptide 2 (SEQ ID NO: 155);
Peptide 3 (SEQ ID NO: 156) and Peptide 4 (SEQ ID NO: 157)

ATGAGAGGATCGCATCACCATCACCATCACGGATCCATGGGCCACACAGAGCCTGGCCCT
1 + + + + + + 60
TACTCTCCTAGCGTAGTGGTAGTGGTAGTGCCTAGGTACCCGGTGTGTCTCGGACCGGGA
MRGSHHHHHHGSMGHTEPGP

CTCCTGATACCATTCACTTTCAACTTTACCATCACCAACCTGCATTATGAGGAAAACATG
61 + + + + + + 120
GAGGACTATGGTAAGTGAAAGTTGAAATGGTAGTGGTTGGACGTAATACTCCTTTTGTAC
LLIPFTFNFTITNLHYEENM

CAACACCCTGGTTCCAGGAAGTTCAACACCACGGAGAGGGTTCTGCAGGGTCTGCTCAAG
121 + + + + + + 180
GTTGTGGGACCAAGGTCCTTCAAGTTGTGGTGCCTCTCCCAAGACGTCCCAGACGAGTTC
3
QHPGSRKFNTTERVLQGLLK

CCCTTGTTCAAGAACACCAGTGTTGGCCCTCTGTACTCTGGCTGCAGACTGACCTTGCTC
181 + + + + + + 240
GGGAACAAGTTCTTGTGGTCACAACCGGGAGACATGAGACCGACGTCTGACTGGAACGAG
PLFKNTSVGPLYSGCRLTLL

AGACCTGAGAAGCATGAGGCAGCCACTGGAGTGGACACCATCTGTACCCACCGCGTTGAT
241 + + + + + + 300
TCTGGACTCTTCGTACTCCGTCGGTGACCTCACCTGTGGTAGACATGGGTGGCGCAACTA
RPEKHEAATGVDTICTHRVD

CCCATCGGACCTGGACTGGACAGAGAGCGGCTATACTGGGAGCTGAGCCAGCTGACCAAC
301 + + + + + + 360
GGGTAGCCTGGACCTGACCTGTCTCTCGCCGATATGACCCTCGACTCGGTCGACTGGTTG
1 4
PIGPGLDRERLYWELSQLTN

AGCATCACAGAGCTGGGACCCTACACCCTGGACAGGGACAGTCTCTATGTCAATGGCTTC
361 + + + + + + 420
TCGTAGTGTCTCGACCCTGGGATGTGGGACCTGTCCCTGTCAGAGATACAGTTACCGAAG
2
SITELGPYTLDRDSLYVNGF

AACCCTCGGAGCTCTGTGCCAACCACCAGCACTCCTGGGACCTCCACAGTGCACCTGGCA
421 + + + + + + 480
TTGGGAGCCTCGAGACACGGTTGGTGGTCGTGAGGACCCTGGAGGTGTCACGTGGACCGT
NPRSSVPTTSTPGTSTVHLA

ACCTCTGGGACTCCATCCTCCCTGCCT
481 + + 507
TGGAGACCCTGAGGTAGGAGGGACGGA
TSGTPSSLP

Peptide 1 (SEQ ID NO: 154) RLYWELSQL
Peptide 2 (SEQ ID NO: 155) TLDRDSLYV
Peptide 3 (SEQ ID NO: 156) VLQGLLKPL
Peptide 4 (SEQ ID NO: 157) QLTNSITEL
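The layout of Table 20 can be checked in two ways: the lower strand in each row should be the base-by-base complement of the upper strand (it is written in the same left-to-right order, not reversed), and each of the four peptides should occur within the translated recombinant sequence. The Python sketch below illustrates both checks on data copied from Table 20. The helper names `complement` and `peptide_starts` are assumptions made for illustration, and positions are 1-based counts within SEQ ID NO: 152 as printed.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement (not reversed) of a DNA strand."""
    return strand.translate(COMPLEMENT)


def peptide_starts(protein: str, peptides: dict) -> dict:
    """Return the 1-based start of each peptide in the protein (0 if absent)."""
    return {number: protein.find(seq) + 1 for number, seq in peptides.items()}


# Nucleotides 1-60 of Table 20: upper strand and the strand printed beneath it.
TOP_ROW = "ATGAGAGGATCGCATCACCATCACCATCACGGATCCATGGGCCACACAGAGCCTGGCCCT"
BOTTOM_ROW = "TACTCTCCTAGCGTAGTGGTAGTGGTAGTGCCTAGGTACCCGGTGTGTCTCGGACCGGGA"

# Amino acid rows of Table 20 joined into one string (SEQ ID NO: 152).
PROTEIN = (
    "MRGSHHHHHHGSMGHTEPGP" "LLIPFTFNFTITNLHYEENM" "QHPGSRKFNTTERVLQGLLK"
    "PLFKNTSVGPLYSGCRLTLL" "RPEKHEAATGVDTICTHRVD" "PIGPGLDRERLYWELSQLTN"
    "SITELGPYTLDRDSLYVNGF" "NPRSSVPTTSTPGTSTVHLA" "TSGTPSSLP"
)

PEPTIDES = {
    1: "RLYWELSQL",
    2: "TLDRDSLYV",
    3: "VLQGLLKPL",
    4: "QLTNSITEL",
}

if __name__ == "__main__":
    print("strands complementary:", complement(TOP_ROW) == BOTTOM_ROW)
    for number, start in sorted(peptide_starts(PROTEIN, PEPTIDES).items()):
        print(f"Peptide {number} starts at residue {start}")
```

Peptides 1 and 4 overlap by two residues (QL), which is why both markers appear over the same row of the table.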
TABLE 21



CA125 Protein Sequence
(SEQ ID NO: 162)

1 MEHITKIPNE AAHRGTIRPV KGPQTSTSPA SPKGLHTGGT KRMETTTTAL
51 KTTTTALKTT SRATLTTSVY TPTLGTLTPL NASRQMASTI LTEMMITTPY
101 VFPDVPETTS SLATSLGAET STALPRTTPS VLNRESETTA SLVSRSGAER
151 SPVIQTLDVS SSEPDTTASW VIHPAETIPT VSKTTPNFFH SELDTVSSTA
201 TSHGADVSSA IPTNISPSEL DALTPLVTIS GTDTSTTFPT LTKSPHETET
251 RTTWLTHPAE TSSTIPRTIP NFSHHESDAT PSIATSPGAE TSSAIPIMTV
301 SPGAEDLVTS QVTSSGTDRN MTIPTLTLSP GEPKTIASLV THPEAQTSSA
351 IPTSTISPAV SRLVTSMVTS LAAKTSTTNR ALTNSPGEPA TTVSLVTHPA
401 QTSPTVPWTT SIFFHSKSDT TPSMTTSHGA ESSSAVPTPT VSTEVPGVVT
451 PLVTSSRAVI STTIPILTLS PGEPETTPSM ATSHGEEASS AIPTPTVSPG
501 VPGVVTSLVT SSRAVTSTTI PILTFSLGEP ETTPSMATSH GTEAGSAVPT
551 VLPEVPGMVT SLVASSRAVT STTLPTLTLS PGEPETTPSM ATSHGAEASS
601 TVPTVSPEVP GVVTSLVTSS SGVNSTSIPT LILSPGELET TPSMATSHGA
651 EASSAVPTPT VSPGVSGVVT PLVTSSRAVT STTIPILTLS SSEPETTPSM
701 ATSHGVEASS AVLTVSPEVP GMVTSLVTSS RAVTSTTIPT LTISSDEPET
751 TTSLVTHSEA KMISAIPTLA VSPTVQGLVT SLVTSSGSET SAFSNLTVAS
801 SQPETIDSWV AHPGTEASSV VPTLTVSTGE PFTNISLVTH PAESSSTLPR
851 TTSRFSHSEL DTMPSTVTSP EAESSSAIST TISPGIPGVL TSLVTSSGRD
901 ISATFPTVPE SPHESEATAS WVTHPAVTST TVPRTTPNYS HSEPDTTPSI
951 ATSPGAEATS DFPTITVSPD VPDMVTSQVT SSGTDTSITI PTLTLSSGEP
1001 ETTTSFITYS ETHTSSAIPT LPVSPGASKM LTSLVISSGT DSTTTFPTLT
1051 ETPYEPETTA IQLIHPAETN TMVPRTTPKF SHSKSDTTLP VAITSPGPEA
1101 SSAVSTTTIS PDMSDLVTSL VPSSGTDTST TFPTLSETPY EPETTATWLT
1151 HPAETSTTVS GTIPNFSHRG SDTAPSMVTS PGVDTRSGVP TTTIPPSIPG
1201 VVTSQVTSSA TDTSTAIPTL TPSPGEPETT ASSATHPGTQ TGFTVPIRTV
1251 PSSEPDTMAS WVTHPPQTST PVSRTTSSFS HSSPDATPVM ATSPRTEASS
1301 AVLTTISPGA PEMVTSQITS SGAATSTTVP TLTHSPGNPE TTALLSTHPR
1351 TETSKTFPAS TVFPQVSETT ASLTIRPGAE TSTALPTQTT SSLFTLLVTG
1401 TSRVDLSPTA SPGVSAKTAP LSTHPGTETS TMIPTSTLSL GLLETTGLLA
1451 TSSSAETSTS TLTLTVSPAV SGLSSASITT DKPQTVTSWN TETSPSVTSV
1501 GPPEFSRTVT GTTMTLIPSE MPTPPKTSHG EGVSPTTILR TTMVEATNLA
1551 TTGSSPTVAK TTTTFNTLAG SLFTPLTTPG MSTLASSSVT SRTSYNHRSW
1601 ISTTSSYNRR YWTPATSTPV TSTFSPGIST SSIPSSTA



AT VPFMVPFTLN 

1651 FTITNLQYEE DMRHPGSRKF NATERELQGL LKPLFRNSSL EYLYSGCRLA
1701 SLRPEKDSSA MAVDAICTHR PDPEDLGLDR ERLYWELSNL TNGIQELGPY
1751 TLDRNSLYVN GFTHRSSMPT TSTPGTSTVD VGTSGTPSSS PSPTAAGPLL
1801 MPFTLNFTIT NLQYEEDMRR TGSRKFNTME SVLQGLLKPL FKNTSVGPLY
1851 SGCRLTLLRP EKDGAATGVD AICTHRLDPK SPGLNREQLY WELSKLTNDI
1901 EELGPYTLDR NSLYVNGFTH QSSVSTTSTP GTSTVDLRTS GTPSSLSSPT
1951 IMAAGPLLVP FTLNFTITNL QYGEDMGHPG SRKFNTTERV LQGLLGPIFK
2001 NTSVGPLYSG CRLTSLRSEK DGAATGVDAI CIHHLDPKSP GLNRERLYWE
2051 LSQLTNGIKE LGPYTLDRNS LYVNGFTHRT SVPTSSTPGT STVDLGTSGT
2101 PFSLPSPATA GPLLVLFTLN FTITNLKYEE DMHRPGSRKF NTTERVLQTL
2151 LGPMFKNTSV GLLYSGCRLT LLRSEKDGAA TGVDAICTHR LDPKSPGLDR
2201 EQLYWELSQL TNGIKELGPY TLDRNSLYVN GFTHWIPVPT SSTPGTSTVD
2251 LGSGTPSSLP SPTAAGPLLV PFTLNFTITN LQYEEDMHHP GSRKFNTTER
2301 VLQGLLGPMF KNTSVGLLYS GCRLTLLRSE KDGAATGVDA ICTHRLDPKS
2351 PGVDREQLYW ELSQLTNGIK ELGPYTLDRN SLYVNGFTHQ TSAPNTSTPG
2401 TSTVDLGTSG TPSSLPSPTS AGPLLVPFTL NFTITNLQYE EDMRHPGSRK
2451 FNTTERVLQG LLKPLFKSTS VGPLYSGCRL TLLRSEKDGA ATGVDAICTH
2501 RLDPKSPGVD REQLYWELSQ LTNGIKSLGP YTLDRNSLYV NGFTHQTSAP
2551 NTSTPGTSTV DLGTSGTPSS LPSPTSAGPL LVPFTLNFTI TNLQYEEDMH
2601 HPGSRKFNTT ERVLQGLLGP MFKNTSVGLL YSGCRLTLLR PEKNGAATGM
2651 DAICSHRLDP KSPGLNREQL YWELSQLTHG IKELGPYTLD RNSLYVNGFT
2701 HRSSVAPTST PGTSTVDLGT SGTPSSLPSP TTAVPLLVPF TLNFTITNLQ
2751 YGEDMRHPGS RKFNTTERVL QGLLGPLFKN SSVGPLYSGC RLISLRSEKD
2801 GAATGVDAIC THHLNPQSPG LDREQLYWQL SQMTNGIKEL GPYTLDRNSL
2851 YVNGFTHRSS GLTTSTPWTS TVDLGTSGTP SPVPSPTTAG PLLVPFTLNF
2901 TITNLQYEED MHRPGSRKFN ATERVLQGLL SPIFKNSSVG PLYSGCRLTS
2951 LRPEKDGAAT GMDAVCLYHP NPKRPGLDRE QLYWELSQLT HNITELGPYS
3001 LDRDSLYVNG FTHQNSVPTT STPGTSTVYW ATTGTPSSFP GHTEPGPLLI
3051 PFTFNFTITN LHYEENMQHP GSRKFNTTER VLQGLLKPLF KNTSVGPLYS
3101 GCRLTSLRPE KDGAATGMDA VCLYHPMPKR PGLDREQLYC ELSQLTHNIT
3151 ELGPYSLDRD SLYVNGFTHQ NSVPTTSTPG TSTVYWATTG TPSSFPGHTE
3201 PGPLLIPFTF NFTITNLHYE ENMQHPGSRK FNTTERVLQG LLKPLFKNTS
3251 VGPLYSGCRL TLLRPEKHEA ATGVDTICTH RVDPIGPGLD RERLYWELSQ
3301 LTNSITELGP YTLDRDSLYV NGFNPRSSVP TTSTPGTSTV HLATSGTPSS
3351 LPGHTAPVPL LIPFTLNFTI TNLHYEENMQ HPGSRKFNTT ERVLQGLLKP
3401 LFKNTSVGPL YSGCRLTLLR PEKHEAATGV DTICTHRVDP IGPGLDREXL
3451 YWELSXLTXX IXELGPYXLD RXSLYVNGFX XXXXXXXTST PGTSXVXLXT
3501 SGTPXXXPXX TSAGPLLVPF TLNFTITNLQ YEEDMHHPGS RKFNTTERVL
3551 QGLLGPMFKN TSVGLLYSGC RLTLLRPEKN GAATGMDAIC SHRLDPKSPG
3601 LDREQLYWEL SQLTHGIKEL GPYTLDRNSL YVNGFTHRSS VAPTSTPGTS
3651 TVDLGTSGTP SSLPSPTTAV PLLVPFTLNF TITNLQYGED MRHPGSRKFN
3701 TTERVLQGLL GPLFKNSSVG PLYSGCRLIS LRSEKDGAAT GVDAICTHHL
3751 NPQSPGLDRE QLYWQLSQMT NGIKELGPYT LDRNSLYVNG FTHRSSGLTT
3801 STPWTSTVDL GTSGTPSPVP SPTTAGPLLV PFTLNFTITN LQYEEDMHRP
3851 GSRKFNATER VLQGLLSPIF KNSSVGPLYS GCRLTSLRPE KDGAATGMDA
3901 VCLYHPNPKR PGLDREQLYW ELSQLTHNIT ELGPYSLDRD SLYVNGFTHQ
3951 SSMTTTRTPD TSTMHLATSR TPASLSGPTT ASPLLVLFTI NCTITNLQYE



4001 EDMRRTGSRK FNTMESVLQG LLKPLFKNTS VGPLYSGCRL TLLRPKKDGA
4051 ATGVDAICTH RLDPKSPGLN REQLYWELSK LTNDIEELGP YTLDRNSLYV
4101 NGFTHQSSVS TTSTPGTSTV DLRTSGTPSS LSSPTIMXXX PLLXPFTLNF
4151 TITNLXYEEX MXXPGSRKFN TTERVLQGLL RPLFKNTSVS SLYSGCRLTL
4201 LRPEKDGAAT RVDAACTYRP DPKSPGLDRE QLYWELSQLT HSITELGPYT
4251 LDRVSLYVNG FNPRSSVPTT STPGTSTVHL ATSGTPSSLP GHTXX XPLL
4301 XPFTLNFTIT NLXYEEXMXX PGSRKFNTTE RVLQGLLKPL FRNSSLEYLY
4351 SGCRLASLRP EKDSSAMAVD AICTHRPDPE DLGLDRERLY WELSNLTNGI
4401 QELGPYTLDR NSLYVNGFTH RSSFLTTSTP WTSTVDLGTS GTPSPVPSPT
4451 TAGPLLVPFT LNFTITNLQY EEDMHRPGSR RFNTTERVLQ GLLTPLFKNT
4501 SVGPLYSGCR LTLLRPEKQE AATGVDTICT HRVDPIGPGL DRERLYWELS
4551 QLTNSITELG PYTLDRDSLY VNGFNPWSSV PTTSTPGTST VHLATSGTPS
4601 SLPGHTAPVP LLIPFTLNFT ITDLHYEENM QHPGSRKFNT TERVLQGLLK
4651 PLFKSTSVGP LYSGCRLTLL RPEKHGAATG VDAICTLRLD PTGPGLDRER
4701 LYWELSQLTN SVTELGPYTL DRDSLYVNGF THRSSVPTTS IPGTSAVHLE
4751 TSGTPASLPG HTAPGPLLVP FTLNFTITNL QYEEDMRHPG SRKFSTTERV
4801 LQGLLKPLFK NTSVSSLYSG CRLTLLRPEK DGAATRVDAV CTHRPDPKSP
4851 GLDRERLYWK LSQLTHGITE LGPYTLDRHS LYVNGFTHQS SMTTTRTPDT
4901 STMHLATSRT PASLSGPTTA SPLLVLFTIN FTITNQRYEE NMHHPGSRKF
4951 NTTERVLQGL LRPVFKNTSV GPLYSGCRLT LLRPKKDGAA TKVDAICTYR
5001 PDPKSPGLDR EQLYWELSQL THSITELGPY TQDRDSLYVN GFTHRSSVPT
5051 TSIPGTSAVH LETSGTPASL PGHTAPGPLL VPFTLNFTIT NLQYEEDMRH
5101 PGSRKFNTTE RVLQGLLKPL FKSTSVGPLY SGCRLTLLRP EKRGAATGVD
5151 TICTHRLDPL NPGLDREQLY WELSKLTRGI IELGPYLLDR GSLYVNGFTH
5201 RTSVPTTSTP GTSTVDLGTS GTPFSLPSPA XXXPLLXPFT LNFTITNLXY
5201 EEXMXXPGSR KFNTTERVLQ TLLGPMFKNT SVGLLYSGCR LTLLRSEKDG
5251 AATGVDAICT HRLDPKSPGV DREQLYWELS QLTNGIKELG PYTLDRNSLY
5301 VNGFTHWIPV PTSSTPGTST VDLGSGTPSL PSSPTTAGPL LVPFTLNFTI
5351 TNLKYEEDMH CPGSRKFNTT ERVLQSLLGP MFKNTSVGPL YSGCRLTLLR
5401 SEKDGAATGV DAICTHRLDP KSPGVDREQL YWELSQLTNG IKELGPYTLD
5451 RNSLYVNGFT HQTSAPNTST PGTSTVDLGT SGTPSSLPSP TXXXPLLXPF
5501 TLNFTITNLX YEEXMXXPGS RKFNTTERVL QGLLXPXFKX TSVGXLYSGC
5551 RLTLLRXEKX XAATXVDXXC XXXXDPXXPG LDREXLYWEL SXLTXXIXEL
5601 GPYXLDRXSL YVNGFTHWIP VPTSSTPGTS TVDLGSGTPS SLPSPTTAGP
5651 LLVPFTLNFT ITNLKYEEDM HCPGSRKFNT TERVLQSLLG PMFKNTSVGP
5701 LYSGCRLTSL RSEKDGAATG VDAICTHRVD PKSPGVDREQ LYWELSQLTN
5751 GIKELGPYTL DRNSLYVNGF THQTSAPNTS TPGTSTVDLG TSGTPSSLPS
5801 PTSAGPLLVP FTLNFTITNL QYEEDMHHPG SRKFNTTERV LQGLLGPMFK

5851 NTSVGLLYSG CRLTLLRPEK NGAATGMDAI CTHRLDPKSP GLDREXLYWE
5901 LSXLTXXIXE LGPYXLDRXS LYVNGFXXXX XXXXTSTPGT SXVXLXTSGT
5951 PXXXPXXTXX XPLLXPFTLN FTITNLXYEE XMXXPGSRKF NTTERVLQGL
6001 LKPLFRNSSL EYLYSGCRLA SLRPEKDSSA MAVDAICTHR PDPEDLGLDR
6051 ERLYWELSNL TNGIQELGPY TLDRNSLYVN GFTHRSSMPT TSTPGTSTVD
6101 VGTSGTPSSS PSPTTAGPLL IPFTLNFTIT NLQYGEDMGH PGSRKFNTTE
6151 RVLQGLLGPI FKNTSVGPLY SGCRLTSLRS EKDGAATGVD AICIHHLDPK
6201 SPGLNRERLY WELSQLTNGI KELGPYTLDR NSLYVNGFTH RTSVPTTSTP
6251 GTSTVDLGTS GTPFSLPSPA TAGPLLVLFT LNFTITNLKY EEDMHRPGSR
6301 KFNTTERVLQ TLLGPMFKNT SVGLLYSGCR LTLLRSEKDG AATGVDAICT
6351 HRLDPKSPGL DREXLYWELS XLTXXIXELG PYXLDRXSLY VNGFXXXXXX
6401 XXTSTPGTSX VXLXTSGTPX XXPXXTXXXP LLXPFTLNFT ITNLXYEEXM
6451 XXPGSRKFNT TERVLQGLLR PVFKNTSVGP LYSGCRLTLL RPKKDGAATK
6501 VDAICTYRPD PKSPGLDREQ LYWELSQLTH SITELGPYTQ DRDSLYVNGF
6551 THRSSVPTTS IPGTSAVHLE TTGTPSSFPG HTEPGPLLIP FTFNFTITNL
6601 RYEENMQHPG SRKFNTTERV LQGLLTPLFK NTSVGPLYSG CRLTLLRPEK
6651 QEAATGVDTI CTHRVDPIGP GLDRERLYWE LSQLTNSITE LGPYTLDRDS
6701 LYVDGFNPWS SVPTTSTPGT STVHLATSGT PSPLPGHTAP VPLLIPFTLN
6751 FTITDLHYEE NMQHPGSRKF NTTERVLQGL LKPLFKSTSV GPLYSGCRLT
6801 LLRPEKHGAA TGVDAICTLR LDPTGPGLDR SRLYWELSQL TNSITELGPY
6851 TLDRDSLYVN GFNPWSSVPT TSTPGTSTVH LATSGTPSSL PGHTTAGPLL
6901 VPFTLNFTIT NLKYEEDMHC PGSRKFNTTE RVLQSLHGPM FKNTSVGPLY
6951 SGCRLTLLRS EKDGAATGVD AICTHRLDPK SPGLDREXLY WELSXLTXXI
7001 XELGPYXLDR XSLYVNGFXX XXXXXXTSTP GTSXVXLXTS GTPXXXPXXT
7051 XXXPLLXPFT LNFTITNLXY EEXMXXPGSR KFNTTERVLQ GLLXPXFKXT
7101 SVGXLYSGCR LTLLRXEKXX AATXVDXXCX XXXDPXXPGL DREXLYWELS
7151 XLTNSITELG PYTLDRDSLY VNGFTHRSSM PTTSIPGTSA VHLETSGTPA
7201 SLPGHTAPGP LLVPFTLNFT ITNLQYEEDM RHPGSRKFNT TERVLQGLLK
7251 PLFKSTSVGP LYSGCRLTLL RPEKRGAATG VDTICTHRLD PLNPGLDREX
7301 LYWELSXLTX XIXELGPYXL DRXSLYVNGF XXXXXXXXTS TPGTSXVXLX
7351 TSGTPXXXPX XTXXXPLLXP FTLNFTITNL XYEEXMXXPG SRKFNTTERV
7401 LQGLLXPXFK XTSVGXLYSG CRLTLLRXEK XXAATXVDXX CXXXXDPXXP
7451 GLDREXLYWE LSXLTXXIXE LGPYXLDRXS LYVNGFHPRS SVPTTSTPGT
7501 STVHLATSGT PSSLPGHTAP VPLLIPFTLN FTITNLHYEE NMQHPGSRKF
7551 NTTERVLQGL LGPMFKNTSV GLLYSGCRLT LLRPSKNGAA TGMDAICSHR
7601 LDPKSPGLDR EXLYWELSXL TXXIXELGPY XLDRXSLYVN GFXXXXXXXX
7651 TSTPGTSXVX LXTSGTPXXX PXXTXXXPLL XPFTLNFTIT NLXYEEXMXX
7701 PGSRKFNTTE RVLQGLLXPX FKXTSVGXLY SGCRLTLLRX EKXXAATXVD
7751 XXCXXXXDPX XPGLDREXLY WELSXLTXXI XELGPYXLDR XSLYVNGFTH
7801 QNSVPTTSTP GTSTVYWATT GTPSSFPGHT EPGPLLIPFT FNFTITNLHY
7851 EENMQHPGSR KFNTTERVLQ GLLTPLFKNT SVGPLYSGCR LTLLRPSKQE
7901 AATGVDTICT HRVDPIGPGL DREXLYWELS XLTXXIXELG PYXLDRXSLY
7951 VNGFXXXXXX XXTSTPGTSX VXLXTSGTPX XXPXXTXXXP LLXPFTLNFT
8001 ITNLXYEEXM XXPGSRKFNT TERVLQGLLX PXFKXTSVGX LYSGCRLTLL
8051 RXEKXXAATX VDXXCXXXXD PXXPGLDREX LYWELSXLTX XIXELGPYXL
8101 DRXSLYVNGF THRSSVPTTS SPGTSTVHLA TSGTPSSLPG HTAPVPLLIP
8151 FTLNFTITNL HYEENMQHPG SRKFNTTERV LQGLLKPLFK STSVGPLYSG
8201 CRLTLLRPEK HGAATGVDAI CTLRLDPTGP GLDREXLYWE LSXLTXXIXE
8251 LGPYXLDRXS LYVNGFXXXX XXXXTSTPGT SXVXLXTSGT PXXXPXXTXX
8301 XPLLXPFTLN FTITNLXYEE XMXXPGSRKF NTTERVLQGL LXPXFKXTSV
8351 GXLYSGCRLT LLRXEKXXAA TXVDXXCXXX XDPXXPGLDR EXLYWELSXL

8401 TXXIXELGPY XLDRXSLYVN GFTHRTSVPT TSTPGTSTVH LATSGTPSSL
8451 PGHTAPVPLL IPFTLNFTIT NLQYEEDMHR PGSRKFNTTE RVLQGLLSPI
8501 FKNSSVGPLY SGCRLTSLRP EKDGAATGMD AVCLYHPNPK RPGLDREQLY
8551 CELSQLTHNI TELGPYSLDR DSLYVNGFTH QNSVPTTSTP GTSTVYWATT
8601 GTPSSFPGHT XXXPLLXPFT LNFTITNLXY EEXMXXPGSR KFNTTERVLQ
8651 GLLXPXFKXT SVGXLYSGCR LTLLRXEKXX AATXVDXXCX XXXDPXXPGL
8701 DREXLYWELS XLTXXIXELG PYXLDRXSLY VNGFTHWSSG LTTSTPWTST
8751 VDLGTSGTPS PVPSPTTAGP LLVPFTLNFT ITNLQYEEDM HRPGSRKFNA
8801 TERVLQGLLS PIFKNTSVGP LYSGCRLTLL RPEKQEAATG VDTICTHRVD
8851 PIGPGLDREX LYWELSXLTX XIXELGPYXL DRXSLYVNGF XXXXXXXXTS
8901 TPGTSXVXLX TSGTPXXXPX XTXXXPLLXP FTLNFTITNL XYEEXMXXPG
8951 SRKFNTTERV LQGLLXPXFK XTSVGXLYSG CRLTLLRXEK XXAATXVDXX
9001 CXXXXDPXXP GLDREXLYWE LSXLTXXIXE LGPYXLDRXS LYVNGFTHRS
9051 FGLTTSTPWT STVDLGTSGT PSPVPSPTTA GPLLVPFTLN FTITNLQYEE
9101 DMHRPGSRKF NTTERVLQGL LTPLFRWTSV SSLYSGCRLT LLRPEKDGAA
9151 TRVDAVCTHR PDPKSPGLDR EXLYWELSXL TXXIXELGPY XLDRXSLYVN
9201 GFXXXXXXXX TSTPGTSXVX LXTSGTPXXX PXXTXXXPLL XPFTLNFTIT
9251 NLXYEEXMXX PGSRKFNTTE RVLQGLLXPX FKXTSVGXLY SGCRLTLLRX
9301 EKXXAATXVD XXCXXXXDPX XPGLDREXLY WELSXLTXXI XELGPYXLDR
9351 XSLYVNGFTH WIPVPTSSTP GTSTVDLGSG TPSSLPSPTT AGPLLVPFTL
9401 NFTITNLQYG EDMGHPGSRK FNTTERVLQG LLGPIFKNTS VGPLYSGCRL
9451 TSLRSEKDGA ATGVDAICIH HLDPKSPGLD RSXLYWELSX LTXXIXELGP
9501 YXLDRXSLYV NGFXXXXXXX XTSTPGTSXV XLXTSGTPXX XPXXTXXXPL
9551 LXPFTLNFTI TNLXYEEXMX XPGSRKFNTT ERVLQGLLXP XFKXTSVGXL
9601 YSGCRLTLLR XEKXXAATXV DXXCXXXXDP XXPGLDREXL YWELSXLTXX
9651 IXELGPYXLD RXSLYVNGFT HQTFAPNTST PGTSTVDLGT SGTPSSLPSP
9701 TSAGPLLVPF TLNFTITNLQ YEEDMHHPGS RKFNTTERVL QGLLGPMFKN
9751 TSVGLLYSGC RLTLLRPEKN GAATRVDAVC THRPDPKSPG LDREXLYWEL
9801 SXLTXXIXEL GPYXLDRXSL YVNGFXXXXX XXXTSTPGTS XVXLXTSGTP
9851 XXXPXXTAPV PLLIPFTLNF TITNLHYEEN MQHPGSRKFN TTERVLQGLL
9901 RPLFKSTSVG PLYSGCRLTL LRPEKHGAAT GVDAICTLRL DPTGPGLDRE
9951 RLYWELSQLT NSVTELGPYT LDRDSLYVNG FTQRSSVPTT SIPGTSAVHL
10001 ETSGTPASLP GHTAPGPLLV PFTLNFTITN LQYEVDMRHP GSRKFMTTER
10051 VLQGLLKPLF KSTSVGPLYS GCRLTLLRPE KRGAATGVDT ICTHRLDPLN
10101 PGLDREQLYW ELSKLTRGII ELGPYLLDRG SLYVNGFTHR NFVPITSTPG
10151 TSTVHLGTSE TPSSLPRPIV PGPLLVPFTL NFTITNLQYE EAMRHPGSRK
10201 FNTTERVLQG LLRPLFKNTS IGPLYSSCRL TLLRPEKDKA ATRVDAICTH
10251 HPDPQSPGLN REQLYWELSQ LTHGITELGP YTLDRDSLYV DGFTHWSPIP
10301 TTSTPGTSIV NLGTSGIPPS LPETTXXXPL LXPFTLNFTI TNLXYEEXMX
10351 XPGSRKFNTT ERVLQGLLKP LFKSTSVGPL YSGCRLTLLR PEKDGVATRV
10451 DAICTHRPDP KIPGLDRQQL YWELSQLTHS ITELGPYTLD RDSLYVNGFT
10501 QRSSVPTTST PGTFTVQPET SETPSSLPGP TATGPVLLPF TLNFTITNLQ
10551 YEEDMHRPGS RKFNTTERVL QGLLMPLFKN TSVSSLYSGC RLTLLRPEKD
10601 GAATRVDAVC THRPDPKSPG LDRERLYWKL SQLTHGITEL GPYTLDRHSL
10651 YVNGFTHQSS MTTTRTPDTS TMHLATSRTP ASLSGPTTAS PLLVLFTINF
10701 TITNLRYEEN MHHPGSRKFN TTERVLQGLL RPVFKNTSVG PLYSGCRLTL
10751 LRPKKDGAAT KVDAICTYRP DPKSPGLDRE QLYWELSQLT HSITELGPYT
10801 QDRDSLYNVG FTQRSSVPTT SVPGTPTVDL GTSGTPVSKP GPSAASPLLV
10851 LFTLNGTITN LRYEENMQHP GSRKFNTTER VLQGLLRSLF KSTSVGPLYS
10901 GCRLTLLRPE KDGTATGVDA ICTHHPDPKS PRLDREQLYW ELSQLTHNIT
10951 ELGHYALDWD SLFVNGFTHR SSVSTTSTPG TPTVYLGASK TPASIFGPSA

11001 ASHLLILFTL NFTITNLRYE ENMWPGSRKF NTTERVLQGL LRPLFKNTSV
11051 GPLYSGSRLT LLRPEKDGEA TGVDAICTHR PDPTGPGLDR EQLYLELSQL
11101 THSITELGPY TLDRDSLYVN GFTHRSSVPT TSTGVVSEEP FTLNFTINNL
11151 RYMADMGQPG SLKFNITDNV MKHLLSPLFQ RSSLGARYTG CRVIALRSVK
11201 NGAETRVDLL CTYLQPLSGP GLPIKQVFHE LSQQTHGITR LGPYSLDKDS
11251 LYLNGYNEPG LDEPPTTPKP ATTFLPPLSE ATTAMGYHLK TLTLNFTISN
11301 LQYSPDMGKG SATFNSTEGV LQHLLRPLFQ KSSMGPFYLG CQLISLRPEK
11351 DGAATGVDTT CTYHPDPVGP GLDIQQLYWE LSQLTHGVTQ LGFYVLDRDS
11401 LFINGYAPQN LSIRGEYQIN FHIVNWNLSN PDPTSSEY

IT LLRDIQDKVT

11451 TLYKGSQLHD TFRFCLVTNL TMDSVLVTVK ALFSSNLDPS LVEQVFLDKT
11501 LNASFHWLGS TYQLVDIHVT EMESSVYQPT SSSSTQHFYL NFTITNLPYS
11551 QDKAQPGTTN YQRNKRNIED ALNQLFRNSS IKSYFSDCQV STFRSVPNRH
11601 HTGVDSLCNF SPLARRVDRV AIYEEFLRMT RNGTQLQNFT LDRSSVLVDG
11651 YSPNRNEPLT GNSDLPFWAV ILIGLAQLLG LITCLICGVL VTTRRRKKEG
11701 EYMVQQQCPG YYQSHLDLED LQ

(SEQ ID NO: 307)






1 ACTGCTGGCC CTCTCCTGGT [illegible] [illegible] [illegible]
51 CCTGCAGTAT GAGGAGGACA TGCATCGCCC [illegible] [illegible]
101 CCACAGAGAG GGTCCTGCAG GGTCTGCTTA [illegible] [illegible]
151 AGTGTTGGCC CTCTGTACTC TGGCTGCAGA [illegible] [illegible]
201 GAAGGATGGA [illegible] [illegible] [illegible] [illegible]
251 ACCCCAAAAG CCCTGGACTC AACAGAGAGC GGCTGTACTG GGAGCTGAGC
301 [illegible] ATGGCATCAA AGAGCTGGGC CCCTACACCC TGGACAGGAA
351 CAGTCTCTAT GTCAATGGTT TCACCCATCG GACCTCTGTG CCCACCACCA
401 GCACTCCTGG GACCTCCACA GTGGACCTTG GAACCTCAGG GACTCCATTC
451 TCCCTCCCAA GCCCCGCA

(SEQ ID NO: 308)




1 TAGPLLVPFT LNFTITNLQY EEDMHRPGSR KFNTTERVLQ GLLSPIFKNT
51 SVGPLYSGCR LTSLRSEKDG AATGVDAICI HHLDPKSPGL NRERLYWELS
101 RLTNGIKELG PYTLDRNSLY VNGFTHRTSV PTTSTPGTST VDLGTSGTPF
151 SLPSPA